In our previous analytics lab, we explored the graphical display of a correlation matrix using the package “corrplot”. Let us now discuss how to combine a correlogram with a test of significance.
per_i<-read.csv("performance index data.csv",header=TRUE)
head(per_i)
    jpi   apt   tol  tech   gen
1 47.12 45.18 56.18 52.18 45.45
2 41.89 33.12 34.47 51.68 52.10
3 51.12 56.92 56.28 53.87 53.61
4 40.10 53.43 60.51 47.88 47.71
5 43.54 52.68 52.28 48.34 47.10
6 40.36 41.31 43.75 48.91 42.47
The dataset contains values for job performance index (jpi), aptitude (apt), test of language (tol), technical knowledge (tech) and general knowledge (gen). Let us first visualise the correlation matrix.
library(corrplot)
cor_per_i<-cor(per_i)
corrplot(cor_per_i,method="circle")
res1 <- cor.mtest(per_i, conf.level = .95)
cor.mtest carries out the significance test and produces p-values and confidence intervals for each pair of input features. It returns a list containing the p-values, the lower confidence limits and the upper confidence limits, which can be accessed using res1$p, res1$lowCI and res1$uppCI respectively.
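To make the idea of a pairwise test concrete, here is a minimal base R sketch that builds a p-value matrix by calling cor.test() on every pair of columns. It is only an illustration of the idea, not the corrplot source code. The helper name pairwise_cor_p is hypothetical, and the built-in mtcars data is used as a stand-in because the CSV file from this lab is not bundled with R.

```r
# Stand-in data: five numeric columns from the built-in mtcars dataset
# (an assumption for illustration; replace with per_i when available).
dat <- mtcars[, c("mpg", "disp", "hp", "drat", "wt")]

# Hypothetical helper: run cor.test() on every pair of columns and
# collect the p-values in a symmetric matrix (diagonal left at 0).
pairwise_cor_p <- function(df, conf.level = 0.95) {
  k <- ncol(df)
  p <- matrix(0, k, k, dimnames = list(names(df), names(df)))
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      test <- cor.test(df[[i]], df[[j]], conf.level = conf.level)
      p[i, j] <- p[j, i] <- test$p.value
    }
  }
  p
}

p_mat <- pairwise_cor_p(dat)
round(p_mat, 4)
```

A matrix of this shape (one row and one column per variable) is what gets passed as p.mat in the plots that follow.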
corrplot(cor_per_i, p.mat = res1$p, sig.level = .05)
This plot gives the correlation matrix along with the result of the significance test. p.mat is the matrix of p-values and sig.level is the significance level at which the test is carried out. The crosses mark the pairs whose correlation is not significant (i.e., the p-value is greater than the specified sig.level).
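The rule behind the crosses can be sketched in a few lines of base R: a pair is flagged when its p-value exceeds the chosen significance level. The snippet below is an illustrative sketch on the built-in mtcars data (a stand-in for the lab's dataset), and the names sig_level and crossed_out are assumptions introduced here.

```r
# Stand-in data (assumption): four numeric columns from mtcars.
dat <- mtcars[, c("mpg", "drat", "qsec", "wt")]
sig_level <- 0.05

# Each column of 'pairs' holds the names of one pair of variables.
pairs <- combn(names(dat), 2)

# p-value of the correlation test for each pair.
p_values <- apply(pairs, 2, function(v) {
  cor.test(dat[[v[1]]], dat[[v[2]]])$p.value
})

# A pair would be crossed out when its p-value is above sig_level.
result <- data.frame(
  var1 = pairs[1, ],
  var2 = pairs[2, ],
  p_value = p_values,
  crossed_out = p_values > sig_level
)
result
```

Lowering sig_level makes the test stricter, so more pairs end up flagged.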
corrplot(cor_per_i, type="upper", order="hclust",
p.mat = res1$p, sig.level = 0.05, insig = "blank")
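We can customise the correlation plot by setting type, order, insig and sig.level. In the call above, type="upper" displays only the upper triangle of the matrix and order="hclust" arranges the variables by hierarchical clustering. insig specifies how the non-significant pairs of variables are displayed; insig="blank" leaves the cell blank for every coefficient that is not significant.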
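For intuition about the hierarchical clustering order, the sketch below reorders a correlation matrix in base R so that strongly correlated variables sit next to each other. It assumes complete-linkage clustering on the distance 1 - r, which is one common choice and may not match the exact ordering corrplot produces. The mtcars data is again a stand-in.

```r
# Stand-in data (assumption): five numeric columns from mtcars.
dat <- mtcars[, c("mpg", "disp", "hp", "drat", "wt")]
r <- cor(dat)

# Treat 1 - r as a distance: highly correlated variables are "close".
hc <- hclust(as.dist(1 - r), method = "complete")

# hc$order is the left-to-right order of the variables in the dendrogram.
ord <- hc$order
r_ordered <- r[ord, ord]
round(r_ordered, 2)
```

Reordering rows and columns with the same permutation keeps the matrix symmetric, so it can still be plotted as a correlogram.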
corrplot(cor_per_i, p.mat = res1$p, insig = "p-value",sig.level = 0.05)
insig="p-value" prints the p-value on every coefficient that is not significant. To print all the p-values, keep insig="p-value" and set sig.level = -1.