(This article was first published on **Data R Value**, and kindly contributed to R-bloggers)

It is worth mentioning that this series of posts began as a personal way of practicing R programming and machine learning. Feedback from the community later encouraged me to continue doing these exercises and sharing them. The bibliography and corresponding authors are cited at all times, and this series is a way of honoring them and giving them the credit they deserve for their work.

We will develop an artificial neural network example. The example was originally published in *“Machine Learning with R”* by Brett Lantz, PACKT publishing 2015 (open source community experience distilled).

The example is about predicting the strength of concrete based on the ingredients used to make it.

We will carry out the exercise verbatim as published in the aforementioned reference.

For more details on artificial neural network algorithms, check the aforementioned reference or any other bibliography of your choice.

### “Machine Learning with R” by Brett Lantz,

### PACKT publishing 2015

### (open source community experience distilled)

### based on: Yeh IC. “Modeling of Strength of

### High Performance Concrete Using Artificial

### Neural Networks.” Cement and Concrete Research

### 1998; 28:1797-1808.

### Strength of concrete example

### relationship between the ingredients used in

### concrete and the strength of finished product

### Dataset

### Compressive strength of concrete

### UCI Machine Learning Data Repository

### http://archive.ics.uci.edu/ml

### install and load required packages

#install.packages("neuralnet")

library(neuralnet)

**### read and explore the data**

concrete <- read.csv("concrete.csv")

str(concrete)

### neural networks work best when the input data

### are scaled to a narrow range around zero

### normalize the dataset values

normalize <- function(x) {

return((x - min(x)) / (max(x) - min(x)))

}
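As a quick check of what min-max normalization does, the sketch below applies the same formula to a small made-up vector (the values are illustrative and not taken from the concrete dataset). The smallest value should map to 0 and the largest to 1.

```r
# min-max normalization, same formula as in the example
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# toy values, assumed for illustration only
toy <- c(10, 20, 30, 40, 50)

toy_norm <- normalize(toy)
print(toy_norm)  # 0.00 0.25 0.50 0.75 1.00
```

Note that a constant column would produce 0/0, which R reports as NaN, so this formula assumes each column has at least two distinct values.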

**### apply normalize() to the dataset columns**

concrete_norm <- as.data.frame(lapply(concrete, normalize))

**### confirm and compare normalization**

summary(concrete_norm$strength)

summary(concrete$strength)

### split the data into training and testing sets

### 75% - 25%

concrete_train <- concrete_norm[1:773, ]

concrete_test <- concrete_norm[774:1030, ]
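The row indices for a 75% / 25% split can also be derived from the number of rows instead of being typed by hand. Here is a minimal sketch on a made-up data frame with 1030 rows, the size of the UCI concrete dataset; `toy` and its columns are hypothetical stand-ins. Taking the first rows as training data is only reasonable if the rows are already in random order.

```r
# hypothetical data frame standing in for the normalized dataset
toy <- data.frame(x = seq_len(1030), y = seq_len(1030) / 1030)

n <- nrow(toy)
n_train <- ceiling(0.75 * n)   # 773 when n is 1030

# first 75% of rows for training, remaining rows for testing
toy_train <- toy[seq_len(n_train), ]
toy_test  <- toy[(n_train + 1):n, ]

print(c(train = nrow(toy_train), test = nrow(toy_test)))
```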

**### training model on the data**

concrete_model <- neuralnet(strength ~ cement + slag +

ash + water + superplastic + coarseagg +

fineagg + age, data = concrete_train)

### visualize the network topology

plot(concrete_model)

### there is one input node for each of the eight

### features, followed by a single hidden node and

### a single output node that predicts the concrete

### strength

### at the bottom of the figure, R reports the number

### of training steps and an error measure called the

### sum of squared errors (SSE)
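To make the topology more tangible, here is a minimal sketch of the arithmetic such a network performs: eight inputs feed one hidden node, whose activation feeds one output node. The weights, the input values, and the target below are invented for illustration and are not taken from the fitted model. The sketch assumes a logistic activation in the hidden node and a linear output node, which I believe are the neuralnet defaults.

```r
# logistic (sigmoid) activation
sigmoid <- function(z) 1 / (1 + exp(-z))

# hypothetical weights, NOT from the fitted model
w_hidden <- c(0.5, -0.3, 0.1, -0.4, 0.2, 0.0, -0.1, 0.6)  # one per input feature
b_hidden <- -0.2                                           # hidden node bias
w_out    <- 0.8                                            # hidden -> output weight
b_out    <- 0.1                                            # output node bias

# one made-up normalized example with eight feature values in [0, 1]
x <- c(0.2, 0.0, 0.5, 0.4, 0.1, 0.7, 0.6, 0.3)

hidden <- sigmoid(sum(w_hidden * x) + b_hidden)  # hidden node activation
output <- w_out * hidden + b_out                 # linear output node

# squared error for this single example against an assumed target
target <- 0.45
sse <- sum((target - output)^2)

print(c(hidden = hidden, output = output, sse = sse))
```

Training adjusts the weights so that the squared error, summed over all training examples, becomes as small as possible.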

### evaluating model performance

### predictions

model_results <- compute(concrete_model, concrete_test[1:8])

predicted_strength <- model_results$net.result

### because this is a numeric prediction problem rather

### than a classification problem, we cannot use a confusion

### matrix to examine model accuracy

### obtain correlation between our predicted concrete strength

### and the true value

cor(predicted_strength, concrete_test$strength)

### a correlation close to 1 indicates a strong linear

### relationship between two variables
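Correlation measures how closely predictions track the true values, but not how far off they are in absolute terms. The sketch below uses made-up vectors in which the predictions correlate perfectly with the truth while still being shifted by a constant. The mean absolute error, an additional measure that the original example does not use, exposes that gap.

```r
# made-up values for illustration
actual    <- c(0.10, 0.30, 0.50, 0.70, 0.90)
predicted <- actual + 0.2      # perfectly linear in actual, but shifted

r   <- cor(predicted, actual)            # correlation
mae <- mean(abs(predicted - actual))     # mean absolute error

print(c(correlation = r, mae = mae))
```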

### improving model performance

### increase the number of hidden nodes to five

concrete_model2 <- neuralnet(strength ~ cement + slag +

ash + water + superplastic + coarseagg +

fineagg + age, data = concrete_train, hidden = 5)

plot(concrete_model2)

**### SSE has been reduced significantly**

**### predictions**

model_results2 <- compute(concrete_model2, concrete_test[1:8])

predicted_strength2 <- model_results2$net.result

**### performance**

cor(predicted_strength2, concrete_test$strength)

### notice that results can differ because neuralnet

### begins with random weights

### if you'd like to match results exactly, use set.seed(12345)

### before building the neural network
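The effect of `set.seed()` can be seen without training a network. In this sketch, random draws from `runif()` stand in for the random starting weights; this is only an analogy and does not reproduce how neuralnet initializes its weights.

```r
set.seed(12345)
first <- runif(3)

set.seed(12345)
second <- runif(3)   # same seed, so the same draws as `first`

third <- runif(3)    # continues the random stream without reseeding

print(identical(first, second))  # TRUE
print(identical(first, third))
```

For the same reason, the seed has to be set immediately before each model you want to reproduce, not just once at the top of the script.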

You can get the example and the dataset at:

https://github.com/pakinja/Data-R-Value
