Nov 28


Is this the carbon atom you are looking for? Using AI to predict CNTs’ atomic coordinates.

Technology is all about making things faster and easier. Thanks to computers, much of the math, design, and experimentation in science can be done far more quickly.

One area that really benefits from the great abilities of computers is nanotechnology.

Nanotechnology is working on materials at atomic scales. This is not always easy, fast, or cheap.

Recent advancements are providing new opportunities to simulate nanomaterials. One catch: these simulations are still not very fast.

And we don’t want slow things that need iterations. We have to make rapid developments.

Even though carbon nanotubes (CNTs) were only discovered in 1991, scientists and entrepreneurs have already found ways to experiment with and commercialize them. We have CNT shields for the military, CNT batteries, and CNT biomarkers…

Artificial Intelligence has been a great support in this fast development.

In the following sections, I will talk about how we can use Artificial Neural Networks to predict CNT’s atomic coordinates. So if you want to learn how AI and nanotech can be integrated, keep reading!

Carbon Nanotubes

a, b Examples of SEM carbon nanotube images. c, d Examples of TEM carbon nanotube images

A carbon nanotube is a tube made of carbon with a diameter typically measured in nanometers. Carbon nanotubes (CNTs) are extremely long, thin cylinders made from sheets of carbon atoms bound together in a hexagonal lattice structure.

CNTs possess a unique combination of high stiffness, high strength, low density, small size, and a broad range of electronic properties from metallic to semiconducting.

Some applications of CNTs:

Research has shown that CNTs have the highest reversible capacity of any carbon material for use in lithium-ion batteries. CNTs are outstanding materials for supercapacitor electrodes.
Many researchers and corporations have already developed CNT-based air and water filtration devices. It has been reported that these filters can not only block the smallest particles but also kill most bacteria. This is another area where CNTs have already been commercialized, and products are on the market now.
Research has shown that CNTs can provide a sizable increase in solar cell efficiency, even in their current unoptimized state. Solar cells developed at the New Jersey Institute of Technology use a carbon nanotube complex, in which a mixture of carbon nanotubes and carbon buckyballs forms snake-like structures.

CNTs come in two forms. The first is multi-walled, a structure of nested tubes. The second is the basic form, a single rolled-up graphitic sheet, and is called a single-walled CNT.

A vector connecting the centers of two hexagons in the graphitic sheet is called the chiral vector, and it determines the structure of a single-walled carbon nanotube.

A carbon nanotube can be specified by its chiral indices (n, m), and its chiral vector can be expressed as C_h = n·a1 + m·a2.

For the vector, n and m are integer chiral indices, and |a1| = |a2| is the lattice constant of graphite. Changing n and m changes the structure of the CNT. The lattice constant is an important physical dimension that determines the geometry of the unit cells in a crystal lattice and is proportional to the distance between atoms in the crystal. Crystals are solids with a very orderly structure: a repeating pattern of atoms, molecules, or ions arranged in three-dimensional space.

There are also zigzag and armchair carbon nanotubes, but we won’t go into detail about them in this article. Zigzag means that the CNT’s m index is equal to 0. Armchair means that the CNT’s n and m indices are equal.
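As a quick illustration of the chiral indices, here is a minimal sketch that classifies a nanotube from (n, m) and estimates its diameter with the standard relation d = a·sqrt(n² + nm + m²)/π. The function names and the lattice constant value (a ≈ 0.246 nm) are my own assumptions for illustration; they do not come from the dataset or the paper.

```python
import math

# Assumed graphene lattice constant in nanometers (illustrative value).
LATTICE_CONSTANT_NM = 0.246


def classify_cnt(n: int, m: int) -> str:
    """Return the family of a single-walled CNT from its chiral indices."""
    if m == 0:
        return "zigzag"
    if n == m:
        return "armchair"
    return "chiral"


def cnt_diameter_nm(n: int, m: int, a: float = LATTICE_CONSTANT_NM) -> float:
    """Diameter = |C_h| / pi, where |C_h| = a * sqrt(n^2 + n*m + m^2)."""
    return a * math.sqrt(n * n + n * m + m * m) / math.pi


for n, m in [(10, 0), (6, 6), (8, 4)]:
    print((n, m), classify_cnt(n, m), round(cnt_diameter_nm(n, m), 3), "nm")
```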

Since the discovery of carbon nanotubes in 1991, scientists have been rapidly researching their unique features.

CNTs’ atomic structures are important because they influence properties such as electrical conductivity (metallic or semiconducting) and stiffness.

Programs like CASTEP or VESTA are used to build CNT models through mathematical calculations. However, these calculations are iterative, which can make simulating different CNTs slower than it needs to be.

Why is modeling CNTs important?

Carbon nanotubes have different properties when their structure is changed. If you want to build a space elevator from a very cheap, stiff but lightweight material, a CNT seems like the solution. But not every carbon nanotube has the same stiffness, and they are not always easy to synthesize. To find the best match for your needs, molecular modeling programs are the best option.

However, you might need to do some calculations to make sure your modeling is accurate. These mathematical calculations take time as they need iterations.

Researchers now use ANNs (Artificial Neural Networks) to predict the atomic coordinates of carbon nanotubes, so that these models can be used within modeling programs to build up new CNTs in a short time.

Artificial Neural Networks

An Artificial Neural Network (ANN) is a computational model loosely inspired by the human brain that allows computers to learn to solve problems from examples, without being given explicit rules about the subject. ANNs are a family of machine learning algorithms, and machine learning is a branch of artificial intelligence. ANNs learn a function from a set of examples called training data and generalize it in order to solve problems not necessarily contained within the training data.

Neural networks are composed of interconnected nodes, or neurons. Each node computes a weighted sum of its inputs and feeds it into an activation function. The output of the activation function is then passed to the nodes in the next layer. During training, the weights are adjusted step by step to reduce the prediction error. The pattern recognition power of ANNs generally improves with more data and more training.
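To make the "weighted input plus activation function" idea concrete, here is a minimal NumPy sketch of one layer processing one sample. The sizes (5 inputs, 4 neurons) and the random weights are assumptions for illustration only, not the trained model from this article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 5 input features feeding 4 neurons.
x = rng.normal(size=5)            # one input sample
W = rng.normal(size=(5, 4))       # one weight per (input, neuron) pair
b = np.zeros(4)                   # one bias per neuron

z = x @ W + b                     # weighted sum for each neuron
a = np.tanh(z)                    # activation squashes z into the range -1 to 1

print("weighted sums:", z)
print("activations:  ", a)
```

The vector `a` is what would be handed to the next layer as its input.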

The study by Acı and Avcı uses ANNs to reduce the number of iterations needed in the CASTEP simulation environment for modeling CNTs.

I was inspired by this paper and decided to build an ANN to predict atomic coordinates with Python.

The paper uses MATLAB for its model and shares some details of its layers. I coded the ANN in Python and experimented with a few layer configurations.

Predicting Atomic Coordinates of CNTs

1-Import Libraries and Dataset

The first step is importing libraries and the dataset.

We are using the dataset created for the previously mentioned paper.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.metrics import mean_squared_error, r2_score, accuracy_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.optimizers import Adam
import datetime
now = datetime.datetime.now

# filter warnings
import warnings
warnings.filterwarnings("ignore")

%matplotlib inline

filepath = 'the_dataset'

raw_data = pd.read_csv(filepath, sep=';',decimal=',')
data = raw_data.copy()
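The `sep=';'` and `decimal=','` arguments matter because the file separates columns with semicolons and writes decimals with a comma. Here is a small sketch of what they do, using two made-up rows in memory. The column names `n`, `m`, and `u` are placeholders I chose, not the real header of the dataset.

```python
import io
import pandas as pd

# Two made-up rows; the column names here are placeholders, not the real header.
sample = io.StringIO(
    "n;m;u\n"
    "2;1;0,679005\n"
    "2;1;0,717298\n"
)

df = pd.read_csv(sample, sep=";", decimal=",")
print(df.dtypes)   # u is parsed as a float column rather than as text
print(df)
```

Without `decimal=','`, the values in `u` would be read as strings and the later numeric steps would fail.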

2-Explore the Data

Let’s explore the dataset. This gives us better insight into what the data contains and how we can use it.

data.info()
data.describe()
data.head()
#you can use these to understand the datatypes, observe the 
#statistical summary or to see an example.

data.hist(figsize=(10,10)) #you can take a look at dataset distributions.

We can also create interactive plots to observe the most populated combined indices. (If you are interested in creating more 3D plots, there is a great Kaggle notebook.)

import plotly.express as px


fig = px.scatter_3d(data,
                    x="Calculated atomic coordinates u'",
                    y="Calculated atomic coordinates v'",
                    z="Calculated atomic coordinates w'",
                    color='Chiral indice n', 
                    size='Chiral indice m', 
                    hover_data=[],
                    opacity=0.4)
fig.update_layout(title='Calculated atomic coordinates')

fig.show()

Let’s go over the code block above. The variable “fig” holds the 3D scatter plot of the calculated atomic coordinates of our CNTs. The atomic coordinates u', v', and w' are plotted on the x, y, and z axes. We use the n index for the color and the m index for the marker size.

3-Process the Data

After exploring our data, we can now separate the x and y data, scale features using MinMaxScaler, and split test-train data.

#y data
y_cols = ["Calculated atomic coordinates u'",
          "Calculated atomic coordinates v'",
          "Calculated atomic coordinates w'"]

#target data 
y_data = data[y_cols]

#remove target data to get the input features
X_data = data.drop(columns=y_cols)

Feature scaling is a method used to normalize the range of independent variables or features of data.

In MinMaxScaler, for any given feature, the minimum value of that feature gets transformed to 0 while the maximum value will transform to 1 and all other values are normalized between 0 and 1.
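The min-max rule described above is just x_scaled = (x − min) / (max − min). Here is a tiny sketch with made-up numbers standing in for one feature column, so you can see the formula at work without the library.

```python
import numpy as np

# Made-up values standing in for one feature column.
feature = np.array([2.0, 4.0, 6.0, 12.0])

# x_scaled = (x - min) / (max - min)
scaled = (feature - feature.min()) / (feature.max() - feature.min())

print(scaled)  # the minimum maps to 0 and the maximum maps to 1
```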

To see which columns of the input data require scaling, we look for columns whose minimum is below −1 or whose maximum is above 1.

scale_cols = [col for col in X_data.columns 
              if X_data[col].min() < -1 
              or X_data[col].max() > 1]

scale_cols

Then we transform the maximum of each of these columns into 1 and the minimum into 0, and scale the other values in between.

from sklearn.preprocessing import MinMaxScaler

# check the ranges before scaling
X_data[scale_cols].min()
X_data[scale_cols].max()

mm = MinMaxScaler()

X_data[scale_cols] = mm.fit_transform(X_data[scale_cols])

We are splitting our data (70% train and 30% test).

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X_data,
                                                    y_data, 
                                                    test_size=0.3,
                                                    random_state=42)
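One thing to be aware of: in the steps above, the scaler is fit on the whole dataset before the split, so the test rows influence the min and max. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows that ordering on random stand-in data. The array shapes and variable names are my assumptions, not the real dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 12, size=(100, 5))   # stand-in for 5 input features
y_demo = rng.uniform(0, 1, size=(100, 3))    # stand-in for u', v', w'

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

scaler = MinMaxScaler()
X_tr_scaled = scaler.fit_transform(X_tr)  # learn min/max from training rows only
X_te_scaled = scaler.transform(X_te)      # reuse those same min/max values

print(X_tr_scaled.shape, X_te_scaled.shape)
```

With this ordering, a few test values can land slightly outside the 0 to 1 range, which is expected and usually harmless.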

4-Build the ANN model

Now, it is time to build our model. Since each layer has exactly one input and one output, and the layers are simply stacked one after another, we can use Sequential().

Our model has 3 hidden layers.

Layer 1: 20 hidden nodes, hyperbolic tangent activation
Layer 2: 30 hidden nodes, hyperbolic tangent activation
Layer 3: 25 hidden nodes, softmax activation
Output layer: 3 nodes, no activation

Why did we use hyperbolic tangent activation?

We could also use “Sigmoid”.

The two are the same up to translation and scaling: tanh(x) = 2·sigmoid(2x) − 1. The logistic sigmoid has a range of 0 to 1, while the hyperbolic tangent has a range of −1 to 1. Our input data is scaled between 0 and 1. We use the hyperbolic tangent because its output is centered on zero, which often helps training converge faster.
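The "same up to translation and scaling" relationship between the two activations can be checked numerically. This is a small sketch. The `sigmoid` helper is written by hand here and is not taken from the article’s model code.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


x = np.linspace(-4, 4, 9)

# tanh is a shifted and rescaled sigmoid: tanh(x) = 2 * sigmoid(2x) - 1
print(np.allclose(np.tanh(x), 2 * sigmoid(2 * x) - 1))

# Output ranges differ: sigmoid stays between 0 and 1, tanh between -1 and 1
print(sigmoid(x).min(), sigmoid(x).max())
print(np.tanh(x).min(), np.tanh(x).max())
```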

For layer 3, we have “Softmax” activation. The Softmax activation function turns the layer’s outputs into relative probabilities: positive values that add up to 1.
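Here is a minimal sketch of what Softmax computes, using three made-up scores. The `softmax` helper is my own illustration, not the Keras implementation.

```python
import numpy as np


def softmax(z):
    # Subtracting the max does not change the result but avoids overflow.
    e = np.exp(z - np.max(z))
    return e / e.sum()


scores = np.array([2.0, 1.0, 0.1])   # made-up layer outputs
probs = softmax(scores)

print(probs)        # larger scores get larger shares
print(probs.sum())  # the shares add up to 1
```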

You can tweak the layers and investigate the results.

model = Sequential()
model.add(Dense(20, input_shape = (5,), activation='tanh'))
model.add(Dense(30, activation = 'tanh'))
model.add(Dense(25, activation = 'softmax'))
model.add(Dense(3, activation=None))
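To get a feeling for how big this network is, we can count its trainable parameters by hand. Each Dense layer has inputs × units weights plus one bias per unit. The sketch below uses only the layer sizes from the model above.

```python
# (inputs, units) for each Dense layer in the model above
layer_shapes = [(5, 20), (20, 30), (30, 25), (25, 3)]

total = 0
for n_in, n_out in layer_shapes:
    params = n_in * n_out + n_out   # weights + one bias per unit
    total += params
    print(f"Dense({n_out}): {params} parameters")

print("Total trainable parameters:", total)
```

The total should match what `model.summary()` reports for the model.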

Let’s add the Adam optimizer.

Adam is one of the most widely used optimizers for training neural networks. It works efficiently on large problems with a lot of data or parameters, and it computes an adaptive learning rate for each parameter.
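To show what "adaptive learning rate" means, here is a simplified one-parameter sketch of the Adam update rule, applied to a toy function. The function `adam_minimize`, the toy problem, and the learning rate of 0.1 are my assumptions for illustration. Keras has its own implementation, which is what the model in this article actually uses.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a 1-D function with Adam, given its gradient function."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)  # step scaled by gradient history
    return x


# Toy problem (an assumption for illustration): f(x) = (x - 3)^2, so f'(x) = 2(x - 3)
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

The division by the square root of `v_hat` is what adapts the step size: parameters with large recent gradients take smaller steps.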

model.compile(optimizer=Adam(learning_rate=0.0015),
              loss='mean_squared_error')

Time to run our code! You can increase or decrease the number of epochs and see how the performance is affected.

run = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=320)
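When you experiment with the number of epochs, it helps to look at the loss values that Keras records. The object returned by `fit` has a `history` attribute, a dictionary with per-epoch lists such as `loss` and `val_loss`. The sketch below uses made-up numbers in place of `run.history` to find the epoch with the lowest validation loss.

```python
# Made-up numbers standing in for run.history from model.fit(...)
history = {
    "loss":     [0.080, 0.030, 0.012, 0.006, 0.004],
    "val_loss": [0.075, 0.032, 0.015, 0.009, 0.010],
}

val_loss = history["val_loss"]
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

print("Lowest validation loss:", val_loss[best_epoch])
print("Reached at epoch:", best_epoch + 1)  # epochs are counted from 1
```

If the validation loss stops improving while the training loss keeps falling, more epochs are probably not helping.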

5-Observe the Performance

Time to see the performance metrics.

# predictions for train and test 
y_train_pred = model.predict(X_train)
y_test_pred = model.predict(X_test)

# R2 score for train and test 
train_score = r2_score(y_train, y_train_pred)
test_score = r2_score(y_test, y_test_pred)

print("_________________________________________________")
print("R2 Score for the training set is:", train_score)
print("R2 Score for the test set is:", test_score)

R2 score

What is an R2 score? The R2 score (coefficient of determination) is the proportion of the variation in the dependent (output) variable that is predictable from the independent (input) variables.
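The definition can be written as R2 = 1 − SS_res / SS_tot, where SS_res is the squared error left after the model and SS_tot is the total squared variation around the mean. Here is a minimal sketch with made-up numbers for a single output.

```python
import numpy as np

# Made-up targets and predictions for a single output
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

ss_res = np.sum((y_true - y_pred) ** 2)          # error left after the model
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean

r2 = 1 - ss_res / ss_tot
print(round(r2, 4))
```

Our model predicts three coordinates at once. As far as I know, scikit-learn’s `r2_score` by default computes this value per output column and averages the results.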

Our R2 score for the test set is 0.99. That means 99% of the variability of the dependent output attribute can be explained by the model, while 1% is still unaccounted for.

So our performance looks good! By this metric, we can say that our model was successful.

Conclusion

Wooh! This was a lot. We first learned about what CNTs are and why modeling them is important, then how ANNs can be used to predict their atomic coordinates.

Some key points:

Carbon nanotubes (CNTs) are relatively recently discovered nanomaterials that have unique properties.
CNTs’ properties are influenced by their shape and structure, so modeling and simulating their structure plays an important role when scientists try new synthesis methods.
Simulating CNTs currently takes a lot of time because iterative mathematical calculations are needed to get accurate atomic coordinates.
Acı and Avcı’s study uses Artificial Neural Networks to predict atomic coordinates, so the simulation software needs fewer of these iterative calculations.
The model we built, inspired by Acı and Avcı’s study, shows high performance.

Thank you for reading my article! If you enjoyed it, you can clap the article and follow my account for more.

Hi! I’m Elanu, a 16-year-old from Turkey who writes about STEM.