R has five basic or "atomic" classes of objects:
- character
- numeric (real numbers)
- integer
- complex
- logical (TRUE/FALSE)
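
As a minimal sketch of the five classes listed above, the following creates one value of each and asks R for its class. The variable names are illustrative only.

```r
# One value of each atomic class; class() reports which one it is.
x_chr <- "hello"   # character
x_num <- 3.14      # numeric (real number)
x_int <- 42L       # integer: the L suffix asks for an integer, not a numeric
x_cpl <- 1 + 2i    # complex
x_lgl <- TRUE      # logical

classes <- c(class(x_chr), class(x_num), class(x_int),
             class(x_cpl), class(x_lgl))
print(classes)
# "character" "numeric" "integer" "complex" "logical"
```

Note that a plain number such as 42 is numeric; the L suffix is what makes it an integer.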
The most basic object is a vector
- A vector can only contain objects of the same class
- BUT: The one exception is a list, which is represented as a vector but can contain objects of different classes.
- An empty vector can be created with the vector() function.
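
The points above can be tried out with a short sketch. It shows that c() coerces mixed input to a single class, that a list keeps each element's own class, and how vector() builds an empty vector. The variable names are illustrative only.

```r
# A vector holds one class, so c() coerces mixed input to a common class.
mixed <- c(1.5, "a", TRUE)
print(class(mixed))    # "character"
print(mixed)           # "1.5" "a" "TRUE"

# A list keeps the class of each element.
items <- list(1.5, "a", TRUE)
item_classes <- sapply(items, class)
print(item_classes)    # "numeric" "character" "logical"

# vector() takes a mode and a length; numeric vectors are filled with 0.
empty <- vector("numeric", length = 3)
print(empty)           # 0 0 0
```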
R objects can have attributes
- names, dimnames
- dimension (e.g. matrices, arrays)
- class
- length
- other user-defined attributes/metadata
Attributes of an object can be accessed using the attributes() function.
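
A small sketch of inspecting and setting attributes, using a matrix as the example object. The attribute name "source" is a made-up, user-defined attribute chosen for illustration.

```r
# A matrix is a vector with a dim attribute.
m <- matrix(1:6, nrow = 2, ncol = 3)
print(attributes(m))    # a list with one element, dim = 2 3

# attr() reads or sets a single attribute; "source" is user-defined metadata.
attr(m, "source") <- "example data"
attr_names <- names(attributes(m))
print(attr_names)       # includes "dim" and "source"

# Length and class are queried with their own functions.
print(length(m))        # 6
print(dim(m))           # 2 3
```

For a plain matrix like this one, attributes() lists only what is stored on the object (here dim and the user-defined entry), so length() and class() are the way to get the other two properties.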
Factors are used to represent categorical data. Factors can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label.
- Factors are treated specially by modelling functions like lm() and glm()
- Using factors with labels is better than using integers because factors are self-describing.
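
The "integer vector with labels" view of a factor can be seen directly in a short sketch. The data values and level names below are invented for illustration.

```r
# An unordered factor; levels = sets the order of the labels explicitly.
responses <- factor(c("yes", "no", "yes", "yes"), levels = c("no", "yes"))
print(levels(responses))       # "no" "yes"
print(as.integer(responses))   # 2 1 2 2  (the underlying integer codes)
print(table(responses))        # counts per level: no 1, yes 3

# An ordered factor supports comparisons between its levels.
sizes <- factor(c("small", "large", "medium"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)
print(sizes[1] < sizes[2])     # TRUE: "small" comes before "large"
```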
Data frames are used to store tabular data
- They are represented as a special type of list where every element of the list has to have the same length
- Unlike matrices, data frames can store different classes of objects in each column
- Data frames also have a special attribute called row.names
- Data frames are usually created by calling read.table() or read.csv()
- A data frame can be converted to a matrix by calling data.matrix()
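
As a rough sketch of the points above, the following builds a small data frame by hand with data.frame() (rather than reading a file), checks the class of each column, and converts it to a matrix. The column names and values are invented for illustration.

```r
# Each column can have its own class, but all columns share one length.
df <- data.frame(id     = 1:3,
                 name   = c("a", "b", "c"),
                 passed = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)

col_classes <- sapply(df, class)
print(col_classes)      # id: integer, name: character, passed: logical
print(row.names(df))    # "1" "2" "3" (the default row names)
print(dim(df))          # 3 3

# data.matrix() converts non-numeric columns to numbers so that
# everything fits into a single matrix.
m <- data.matrix(df)
print(is.matrix(m))     # TRUE
```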
Here are some basic commands for practicing:
1. Assignment operator
2. Create a vector of objects with the c() function
3. Coerce from one class to another using the as.* functions, if available
4. Create matrices
5. Create lists
6. Create factors
7. Check for missing values
8. Create data frames
9. Name R objects
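
One possible way to work through the nine items above is sketched below, with one or two lines per item. All object names and values are invented for practice purposes.

```r
# 1. Assignment operator
x <- 5

# 2. Create a vector with c()
v <- c(1, 2, NA, 4)

# 3. Coerce between classes with as.* functions
seven <- as.integer("7")          # 7
flags <- as.logical(c(0, 1, 2))   # FALSE TRUE TRUE

# 4. Create a matrix (values are filled column by column)
m <- matrix(1:6, nrow = 2)

# 5. Create a list
l <- list(1, "a", TRUE)

# 6. Create a factor
f <- factor(c("low", "high", "low"))

# 7. Check for missing values
missing <- is.na(v)               # FALSE FALSE TRUE FALSE

# 8. Create a data frame
df <- data.frame(score = c(90, 85), grade = c("A", "B"))

# 9. Name an R object
names(v) <- c("a", "b", "c", "d")
print(v)
```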
Have fun!