Relational operators compare values and are often used when defining new variables and subsets of datasets. Here are the common relational operators in R:
Function | Operator | Example | Example Result |
---|---|---|---|
Equal to | == | "A" == "a" | FALSE (because R is case sensitive). Note that == (double equals) is different from = (single equals), which acts like the assignment operator <- |
Not equal to | != | 2 != 0 | TRUE |
Greater than | > | 4 > 2 | TRUE |
Less than | < | 4 < 2 | FALSE |
Greater than or equal to | >= | 6 >= 4 | TRUE |
Less than or equal to | <= | 6 <= 4 | FALSE |
Value is missing | is.na() | is.na(7) | FALSE (see section on missing values) |
Value is not missing | !is.na() | !is.na(7) | TRUE |
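As a minimal sketch of how these operators behave on a whole column rather than a single value, the example below applies them to a made-up vector. The name `ages` and its values are hypothetical and only serve as illustration.

```r
# Hypothetical ages in years; NA marks a missing value
ages <- c(2, 15, 34, NA, 67)

# Relational operators are applied element by element
is_adult <- ages >= 18
is_adult                 # FALSE FALSE TRUE NA TRUE (a comparison with NA gives NA)

# is.na() and !is.na() return TRUE/FALSE for every element
is.na(ages)              # FALSE FALSE FALSE TRUE FALSE

# A logical vector can be used to subset: keep only the non-missing ages
known_ages <- ages[!is.na(ages)]
known_ages               # 2 15 34 67
```

The same logical vectors are what functions such as filter() or case_when() use when selecting rows or defining new variables.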
Logical operators, such as AND and OR, are often used to connect relational operators and create more complicated criteria. Complex statements might require parentheses ( ) for grouping and order of application.
Function | Operator |
---|---|
AND | & |
OR | \| (vertical bar) |
Parentheses | ( ) Used to group criteria together and clarify order |
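The sketch below shows one way the logical operators combine relational tests. The vectors `age` and `sex` are hypothetical example data. Note that R evaluates & before |, so parentheses are a safe way to make the intended grouping explicit.

```r
# Hypothetical example data
age <- c(4, 30, 70, 45)
sex <- c("f", "m", "f", "f")

# OR: under 5 years, or 65 years and over
young_or_old <- age < 5 | age >= 65
young_or_old             # TRUE FALSE TRUE FALSE

# AND: women aged 15 to 49
women_15_49 <- sex == "f" & age >= 15 & age <= 49
women_15_49              # FALSE FALSE FALSE TRUE

# Parentheses change which criteria are grouped together
grouped_a <- sex == "m" | (sex == "f" & age < 5)
grouped_b <- (sex == "m" | sex == "f") & age < 5
grouped_a                # TRUE TRUE FALSE FALSE
grouped_b                # TRUE FALSE FALSE FALSE
```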
For example, below, we have a linelist with two variables we want to use to create our case definition: hep_e_rdt, a test result, and other_cases_in_hh, which tells us whether there are other cases in the household. The command below uses the function case_when() to create the new variable case_def such that:
# case_when() checks the conditions in order; the first one that is TRUE sets the value
linelist_cleaned <- linelist_cleaned %>%
  mutate(case_def = case_when(
    is.na(hep_e_rdt) & is.na(other_cases_in_hh)          ~ NA_character_,
    hep_e_rdt == "Positive"                              ~ "Confirmed",
    hep_e_rdt != "Positive" & other_cases_in_hh == "Yes" ~ "Probable",
    TRUE                                                 ~ "Suspected"  # everything else
  ))
Criteria in example above | Resulting value in new variable “case_def” |
---|---|
If the values of both hep_e_rdt and other_cases_in_hh are missing | NA (missing) |
If the value in hep_e_rdt is “Positive” | “Confirmed” |
If the value in hep_e_rdt is NOT “Positive” AND the value in other_cases_in_hh is “Yes” | “Probable” |
If none of the above criteria are met | “Suspected” |
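To see these rules applied, here is a small self-contained sketch that runs the same case_when() logic on a made-up data frame. The object `toy_linelist` and its five rows are hypothetical, and the example assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical five-row linelist, for illustration only
toy_linelist <- data.frame(
  hep_e_rdt         = c("Positive", "Negative", "Negative", NA, NA),
  other_cases_in_hh = c("No", "Yes", "No", NA, "Yes"),
  stringsAsFactors  = FALSE
)

toy_linelist <- toy_linelist %>%
  mutate(case_def = case_when(
    is.na(hep_e_rdt) & is.na(other_cases_in_hh)          ~ NA_character_,
    hep_e_rdt == "Positive"                              ~ "Confirmed",
    hep_e_rdt != "Positive" & other_cases_in_hh == "Yes" ~ "Probable",
    TRUE                                                 ~ "Suspected"
  ))

toy_linelist$case_def
# "Confirmed" "Probable" "Suspected" NA "Suspected"
```

The last row has a missing test result but a known household contact. Comparisons with NA return NA rather than TRUE, so neither the "Confirmed" nor the "Probable" rule matches and the row falls through to "Suspected".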
Note that R is case-sensitive, so “Positive” is different than “positive”…
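A short sketch of what case sensitivity means in practice is given below. The vector `rdt_entries` is hypothetical. Converting text to lower case with the base R function tolower() before comparing is one possible way to make a comparison ignore capitalisation.

```r
# Hypothetical test results entered with inconsistent capitalisation
rdt_entries <- c("Positive", "positive", "POSITIVE", "Negative")

# Exact comparison only matches the identical spelling
exact_match <- rdt_entries == "Positive"
exact_match              # TRUE FALSE FALSE FALSE

# Lower-casing first makes the comparison ignore capitalisation
any_case_match <- tolower(rdt_entries) == "positive"
any_case_match           # TRUE TRUE TRUE FALSE
```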
In R, missing values are represented by the special value NA (capital letters N and A, not in quotation marks). If you import data that records missing data in another way (e.g. 99, “Missing”, or .), you may want to re-code those values to NA.
To test whether a value is NA, use the special function is.na(), which returns TRUE or FALSE.
rdt_result <- c("Positive", "Suspected", "Positive", NA) # two positive cases, one suspected, and one unknown
is.na(rdt_result) # Tests whether the value of rdt_result is NA
## [1] FALSE FALSE FALSE TRUE
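The sketch below illustrates re-coding a placeholder to NA and why is.na() is needed instead of ==. The vector `ages_raw` and the use of 99 as a missing-value code are assumptions made for the example; this is one base R approach among several.

```r
# Hypothetical ages where 99 was used to record "missing"
ages_raw <- c(25, 99, 41, 99, 8)

# Re-code the placeholder 99 to NA
ages_clean <- ages_raw
ages_clean[ages_clean == 99] <- NA
ages_clean               # 25 NA 41 NA 8

# Comparing with == NA does not work: the result is always NA
compared <- ages_clean == NA
compared                 # NA NA NA NA NA

# is.na() gives a usable TRUE/FALSE answer, which can also be counted
n_missing <- sum(is.na(ages_clean))
n_missing                # 2
```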
Mathematical operators are often used to perform calculations such as addition and division, for example when creating new columns. Below are common mathematical operators in R. Spaces around the operators are optional.
Objective | Example in R |
---|---|
addition | 2 + 3 |
subtraction | 2 - 3 |
multiplication | 2 * 3 |
division | 30 / 5 |
exponent | 2^3 |
order of operations | ( ) |
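As a brief illustration of order of operations and of arithmetic on whole columns, consider the sketch below. The vectors `weight_kg` and `height_m` are hypothetical values chosen for the example.

```r
# Multiplication is evaluated before addition unless parentheses say otherwise
without_parens <- 2 + 3 * 4
with_parens    <- (2 + 3) * 4
without_parens           # 14
with_parens              # 20

# Spacing around operators does not change the result
2+3 == 2 + 3             # TRUE

# Operators work element by element, which is how new columns are calculated
weight_kg <- c(60, 72)     # hypothetical weights
height_m  <- c(1.6, 1.8)   # hypothetical heights
bmi <- weight_kg / height_m^2   # exponent is applied before division
round(bmi, 1)
```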