Subsetting columns

The new database I just created still has some columns that came with the original databases, which I do not need and do not want to use. To deal with this, you can either delete the columns you do not want or create a new database with only the columns you need. Let's try.

Deleting columns

This is the simplest approach. Basically, I set the column I want to delete to NULL, like this:

MergeData$X = NULL   # deletes the column called X, which is probably an index number added by one of the data sources
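To see the effect without touching MergeData, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical and only stand in for the real database; checking names() before and after is one way to confirm that the column is gone.

```r
# Toy data frame standing in for MergeData (hypothetical values)
toy = data.frame(
  X = 1:3,
  country = c("Afghanistan", "Albania", "Algeria"),
  year = c(1952, 1952, 1952)
)

names(toy)       # column names before deleting
toy$X = NULL     # delete the column called X
names(toy)       # X is no longer listed
ncol(toy)        # number of columns that are left
```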

Now you try: delete the column “continent.x”.

Selecting columns

Obviously, if you have many columns that you do not need, deleting them one by one may take a while, as you have to type the name of each column you want to delete. Alternatively, you can simply select the columns you need, like this:

SelectedColumns = MergeData[, c("country", "year", "gdpPercap", "lifeExp")]

This syntax should already be familiar to you, as it is the indexing syntax we used earlier: [rows, columns].

Reading the code above: I created a new database called SelectedColumns, which takes the database MergeData and selects all rows (because there is nothing to the left of the comma) and the four columns listed in the vector after the comma.

head(SelectedColumns)
##       country year gdpPercap lifeExp
## 1 Afghanistan 1952  779.4453  28.801
## 2 Afghanistan 1957  820.8530  30.332
## 3 Afghanistan 1962  853.1007  31.997
## 4 Afghanistan 1967  836.1971  34.020
## 5 Afghanistan 1972  739.9811  36.088
## 6 Afghanistan 1977  786.1134  38.438
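
A possible middle ground between the two approaches, when only a few columns need to go, is to keep every column whose name is not in a list. The sketch below again uses a made-up data frame: toy, drop, kept, same and all the values are hypothetical, and only the column names come from the text above. It also shows base R's subset() function as another way to pick columns by name.

```r
# Toy data frame standing in for MergeData (hypothetical values)
toy = data.frame(
  X = 1:2,
  country = c("Afghanistan", "Albania"),
  continent.x = c("Asia", "Europe"),
  year = c(1952, 1952),
  lifeExp = c(30.5, 55.5)
)

# Names of the columns I want to get rid of
drop = c("X", "continent.x")

# Keep all rows and every column whose name is NOT in 'drop'
kept = toy[, !(names(toy) %in% drop)]
names(kept)

# subset() selects the same columns by listing the ones to keep
same = subset(toy, select = c(country, year, lifeExp))
names(same)
```

Both versions leave toy itself untouched, so you can always go back to the full set of columns.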
