Machine Learning. Artificial Neural Networks (Strength of Concrete).

By Data Scientist PakinJa

(This article was first published on Data R Value, and kindly contributed to R-bloggers)

It is worth mentioning that this series of posts began as a personal way of practicing R programming and machine learning. Feedback from the community later encouraged me to continue doing these exercises and sharing them. The bibliography and corresponding authors are cited at all times, and this series is a way of honoring them and giving them the credit they deserve for their work.

We will develop an artificial neural network example. The example was originally published in “Machine Learning with R” by Brett Lantz, Packt Publishing, 2015 (open source community experience distilled).


The example we will develop is about predicting the strength of concrete based on the ingredients used to make it.

We will carry out the exercise verbatim as published in the aforementioned reference.

For more details on artificial neural networks, it is recommended to check the aforementioned reference or any other bibliography of your choice.

### “Machine Learning with R” by Brett Lantz,
### Packt Publishing 2015
### (open source community experience distilled)
### based on: Yeh IC. “Modeling of Strength of
### High Performance Concrete Using Artificial
### Neural Networks.” Cement and Concrete Research
### 1998; 28:1797-1808.

### Strength of concrete example
### relationship between the ingredients used in
### concrete and the strength of finished product

### Dataset
### Compressive strength of concrete
### UCI Machine Learning Data Repository
### http://archive.ics.uci.edu/ml

### install and load required packages
#install.packages("neuralnet")

library(neuralnet)

### read and explore the data
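### the file name below is an assumption; point it to your copy of the dataset
concrete <- read.csv("concrete.csv")
str(concrete)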

### neural networks work best when the input data
### are scaled to a narrow range around zero

### normalize the dataset values
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
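
As a quick sanity check of the min-max scaling used here, the following minimal sketch applies the same formula to a small made-up vector and a toy data frame. The name normalize_demo and all sample values are illustrative assumptions, not part of the original example. Note that this formula maps each column to the range 0 to 1.

```r
### min-max normalization on a toy vector (illustrative values only)
normalize_demo <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

toy <- c(10, 20, 30, 50)
toy_norm <- normalize_demo(toy)
print(toy_norm)   # smallest value maps to 0, largest to 1

### the same function applied column by column to a toy data frame
toy_df <- data.frame(a = c(1, 2, 3), b = c(100, 150, 400))
toy_df_norm <- as.data.frame(lapply(toy_df, normalize_demo))
print(range(toy_df_norm$b))
```

Using lapply() keeps the scaling independent per column, so a feature with large raw values does not dominate one with small values.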

### apply normalize() to the dataset columns
concrete_norm <- as.data.frame(lapply(concrete, normalize))

### confirm and compare normalization
summary(concrete_norm$strength)
summary(concrete$strength)

### split the data into training and testing sets
### 75% – 25%

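### the row ranges below assume the 1030-row dataset
concrete_train <- concrete_norm[1:773, ]
concrete_test <- concrete_norm[774:1030, ]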
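
A split by row position is only representative when the rows are in random order. If that cannot be assumed for your data, one option is to shuffle the row indices first. The sketch below uses a made-up data frame; the names toy, train_idx and the seed value are illustrative assumptions and are not taken from the original example.

```r
### random 75% / 25% split on a toy data frame (hypothetical data)
set.seed(123)                       # arbitrary seed, only for repeatability
toy <- data.frame(x = 1:20, y = (1:20) * 2)

n_train   <- floor(0.75 * nrow(toy))               # 15 of 20 rows
train_idx <- sample(seq_len(nrow(toy)), size = n_train)

toy_train <- toy[train_idx, ]       # randomly chosen training rows
toy_test  <- toy[-train_idx, ]      # all remaining rows

print(nrow(toy_train))
print(nrow(toy_test))
```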

### training model on the data
concrete_model <- neuralnet(strength ~ cement + slag +
                              ash + water + superplastic + coarseagg +
                              fineagg + age, data = concrete_train)

### visualize the network topology

plot(concrete_model)


### there is one input node for each of the eight
### features, followed by a single hidden node and
### a single output node that predicts the concrete
### strength
### at the bottom of the figure, R reports the number
### of training steps and an error measure called the
### sum of squared errors (SSE)

### evaluating model performance

### predictions
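### the first eight columns are assumed to hold the features
model_results <- compute(concrete_model, concrete_test[1:8])
predicted_strength <- model_results$net.result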

### because this is a numeric prediction problem rather
### than a classification problem, we cannot use a confusion
### matrix to examine model accuracy
### obtain correlation between our predicted concrete strength
### and the true value

cor(predicted_strength, concrete_test$strength)

### a correlation close to 1 indicates a strong linear
### relationship between the two variables
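
Because the model is trained on normalized data, its predictions are also on the normalized scale. Correlation is unaffected by this, but an error expressed in the original units needs the scaling reversed first. The following is a minimal sketch under that assumption; the helper unnormalize and every number in it are made up for illustration and do not come from the original example.

```r
### reverse min-max scaling and compute a mean absolute error
### (all values below are hypothetical)
unnormalize <- function(x_norm, original) {
  x_norm * (max(original) - min(original)) + min(original)
}

strength_raw  <- c(20, 35, 50, 80)           # hypothetical true strengths
strength_norm <- (strength_raw - min(strength_raw)) /
  (max(strength_raw) - min(strength_raw))

pred_norm <- c(0.1, 0.2, 0.6, 0.9)           # hypothetical model output
pred_raw  <- unnormalize(pred_norm, strength_raw)

mae <- mean(abs(pred_raw - strength_raw))    # error in original units
print(pred_raw)
print(mae)
print(cor(pred_raw, strength_raw))           # same as on the normalized scale
```

Reporting an absolute error next to the correlation helps, since a high correlation alone does not say how far the predictions are from the true values.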

### improving model performance
### increase the number of hidden nodes to five
concrete_model2 <- neuralnet(strength ~ cement + slag +
                               ash + water + superplastic + coarseagg +
                               fineagg + age, data = concrete_train, hidden = 5)

plot(concrete_model2)


### SSE has been reduced significantly

### predictions
model_results2 <- compute(concrete_model2, concrete_test[1:8])
predicted_strength2 <- model_results2$net.result

### performance
cor(predicted_strength2, concrete_test$strength)

### notice that results can differ because neuralnet
### begins with random weights
### if you’d like to match results exactly, use set.seed(12345)
### before building the neural network
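
The effect of seeding can be illustrated without training a network at all. In the sketch below, runif() stands in for the random starting weights; this is only an analogy for illustration and makes no claim about how neuralnet draws its weights internally.

```r
### the same seed reproduces the same random draws
set.seed(12345)
w1 <- runif(5)

set.seed(12345)
w2 <- runif(5)      # reseeded: repeats the first draw

w3 <- runif(5)      # not reseeded: continues the random stream

print(identical(w1, w2))
print(identical(w1, w3))
```

For the same reason, set.seed() has to be called immediately before each model is built if each run should be repeatable on its own.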


You can get the example and the dataset in:
https://github.com/pakinja/Data-R-Value
