This article will introduce 43 commonly used Excel functions for data analysis and their purposes.
About Functions:Excel functions are essentially complex formulas that handle intricate calculations. By entering relevant parameters in the proper format, you can obtain results. For example, to find the sum of a range, you can simply use SUM(A1:C100)
So, there's no need to memorize functions verbatim; just understand, for instance, that "to select columns, use Left/Right/Mid" functions and know which parameters are needed. For more complex tasks, let Google handle it.
Function Categories:
Lookup and Reference
Data Cleaning
Logical Operations
Statistical Calculations
Time Series
I. Lookup and Reference Functions
Often, the required data is not in the same Excel sheet or even in the same workbook, making copying cumbersome and inaccurate. How do we integrate this data? These functions are used for multi-table references or row/column comparisons, especially in complex tables.
The functions HLOOKUP and VLOOKUP are used to find data in a table.
VLOOKUP
Function: Looks for an element that meets the condition in the first column.
Syntax:
=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
HLOOKUP
Function: Searches the top row or an array for a value and returns a value in the same column from the specified row.
Syntax:
=HLOOKUP(lookup_value, table_array, row_index_num, [range_lookup])
Difference: HLOOKUP returns a value in the same column, while VLOOKUP returns a value in the same row.
INDEX
Function: Returns a value or reference from a table or range.
Syntax:
=INDEX(array, row_num, [column_num])
MATCH
Function: Returns the position of a specified value in a given range (row or column).
Syntax:
=MATCH(lookup_value, lookup_array, [match_type])
RANK
Function: Finds the rank of a value within a set of values in a specified range.
Syntax:
=RANK(number, ref, [order])
ROW
Function: Returns the row number of a cell.
COLUMN
Function: Returns the column number of a cell.
OFFSET
Function: Returns a reference that is offset from a specified base reference by a specified number of rows and columns.
Syntax:
=OFFSET(reference, rows, cols, [height], [width])
II. Data Cleaning Functions
Before processing data, initial cleaning may be necessary, such as removing string spaces, merging cells, replacing text, truncating strings, and finding the position of text.
Trim: Use to remove spaces from both ends of a string.
LTrim: Use to remove spaces from the right side of a string.
RTrim: Use to remove spaces from the left side of a string.
Concatenate: Use to merge the contents of cells.
Left/Right/Mid: Use to truncate strings from the left, right, or middle, respectively.
Replace/Substitute: Use to replace parts of a string.
Find/Search: Use to locate the position of text in a cell.
TRIM
Function: Removes spaces from both ends of a string.
LTRIM
Function: Removes spaces from the right of a string.
RTRIM
Function: Removes spaces from the left of a string.
CONCATENATE
Syntax:
=CONCATENATE(cell1, cell2, ...)
Merges cell contents. An alternative is using
&
, which is faster for many items.LEFT
Function: Extracts a substring from the left.
Syntax:
=LEFT(text, num_chars)
RIGHT
Function: Extracts a substring from the right.
Syntax:
=RIGHT(text, num_chars)
MID
Function: Extracts a substring from the middle.
Syntax:
=MID(text, start_num, num_chars)
REPLACE
Function: Replaces part of a string.
Syntax:
=REPLACE(old_text, start_num, num_chars, new_text)
SUBSTITUTE
Similar to Replace, but based on content rather than position.
FIND
Function: Finds the position of a text string.
Syntax:
=FIND(find_text, within_text, [start_num])
SEARCH
Function: Returns the position of a character or text string in another string, searching from left to right.
Syntax:
=SEARCH(find_text, within_text, [start_num])
Difference: FIND is case-sensitive, while SEARCH is not.
LEN
Function: Counts the number of characters in a text string.
LENB
Function: Returns the number of bytes used by the text.
III. Logical Operations
IF
Function: Returns one value if a condition is true and another if false.
Syntax:
=IF(condition, value_if_true, value_if_false)
AND
Function: Returns TRUE if all conditions are true.
OR
Function: Returns TRUE if at least one condition is true.
IV. Statistical Calculations
Excel has built-in functions for statistical calculations, which are among the most commonly used.
MIN: Finds the minimum value in a range.
MAX: Finds the maximum value in a range.
AVERAGE: Calculates the average in a range.
COUNT: Counts the number of cells that contain numbers.
COUNTIF: Counts the number of cells that meet specified criteria.
COUNTIFS: Counts the number of cells that meet multiple criteria.
SUM: Calculates the total of a range.
SUMIF: Sums cells that meet specified criteria.
SUMIFS: Sums cells that meet multiple criteria.
SUMPRODUCT: Returns the sum of products of corresponding ranges.
MIN
Function: Finds the minimum value in a range.
MAX
Function: Finds the maximum value in a range.
AVERAGE
Function: Calculates the average of a range.
COUNT
Function: Counts cells containing numbers.
COUNTIF
Function: Counts cells that meet specified criteria.
Syntax:
=COUNTIF(range, criteria)
COUNTIFS
Function: Counts cells that meet multiple criteria.
Syntax:
=COUNTIFS(criteria_range1, criteria1, [criteria_range2, criteria2], ...)
SUM
Function: Calculates the sum of a range.
SUMIF
Function: Sums cells that meet specified criteria.
Syntax:
=SUMIF(range, criteria, [sum_range])
SUMIFS
Function: Sums cells that meet multiple criteria.
Syntax:
=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...)
SUMPRODUCT
Function: Returns the sum of products of corresponding ranges.
Syntax:
=SUMPRODUCT(array1, [array2], ...)
STDEV
Function: Calculates standard deviation.
SUBTOTAL
Syntax:
=SUBTOTAL(function_num, ref1, [ref2], ...)
Provides various statistical operations.
INT/ROUND
Function: Rounds numbers. INT rounds down, ROUND rounds based on decimal places.
Example:
ROUND(3.1415, 2) = 3.14
V. Time Series Functions
These functions are specifically designed for handling and converting date formats.
TODAY
Returns today’s date dynamically.
NOW
Returns the current time dynamically.
YEAR
Function: Returns the year from a date.
MONTH
Function: Returns the month from a date.
DAY
Function: Returns the day of a given date as a serial number.
WEEKDAY
Function: Returns the day of the week for a given date. Default values range from 1 (Sunday) to 7 (Saturday).
Syntax:
=WEEKDAY(date, [return_type])
DATEDIF
Function: Calculates the difference in days, months, or years between two dates.
Syntax:
=DATEDIF(start_date, end_date, unit)
Summary
These are some essential Excel functions for data analysis. Understanding how to use them effectively can save you significant time and enhance your analytical capabilities. Rather than trying to memorize each function, focus on understanding their applications and experimenting with them.