Top 100 Python for Data Analysis MCQs

Python keeps growing in popularity for data analysis, first of all because it is easy to learn and use. It also has powerful libraries for data analysis tasks: NumPy for fast numerical calculations, Pandas for handling and analyzing data, and Matplotlib for data visualization. On top of that, it works well with large datasets and connects easily with AI tools, helping analysts work faster and smarter.

Why Practice Python for Data Analysis MCQs?

Practising Multiple Choice Questions (MCQs) is one of the best ways to test your foundational knowledge. While writing code is important, understanding the theoretical concepts ensures you can apply them with confidence. This collection of “Top 100 Python for Data Analysis MCQs” is designed to help you:

Assess your understanding of core data analysis libraries.
Prepare for technical interviews and coding rounds.
Excel in academic quizzes and competitive government exams.
Identify gaps in your knowledge regarding data manipulation and cleaning.

What is Covered in This Python Data Analysis Quiz?

Data analysis in Python is broad and involves multiple libraries and tools. To cover the essential concepts, this quiz combines questions from the following key domains:

Pandas DataFrames and Series: Questions focusing on data structures, indexing, slicing, and data manipulation.
Data Cleaning and Preparation: Scenarios involving handling missing values, duplicates, string manipulation, and data type conversions.
NumPy Arrays: Fundamental questions on array creation, vectorization, mathematical operations, and broadcasting.
Aggregation and Grouping: Testing your knowledge on grouping data, pivot tables, and statistical summaries.
Data Import/Export: Methods for reading and writing data formats like CSV, Excel, JSON, and SQL.

100 Python for Data Analysis MCQs with Answers

Q1. Which Python library is considered the most fundamental for data manipulation and analysis in Python?

A. NumPy
B. Pandas
C. Matplotlib
D. SciPy

Answer: B
Pandas is widely recognized as the primary library for data manipulation and analysis in Python.

Q2. What is the primary data structure in Pandas used to represent a two-dimensional labeled array?

A. Series
B. DataFrame
C. Array
D. Matrix

Answer: B
A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.

Q3. Which method is used to read a CSV file into a Pandas DataFrame?

A. pd.read_csv()
B. pd.load_csv()
C. pd.import_csv()
D. pd.open_csv()

Answer: A
The pd.read_csv() function is the standard way to read comma-separated values files into a DataFrame.

Q4. How do you display the first 5 rows of a DataFrame named ‘df’?

A. df.head(5)
B. df.first(5)
C. df.top(5)
D. df.show(5)

Answer: A
The head() method returns the first n rows of a DataFrame, defaulting to 5 if no argument is provided.

Q5. Which Pandas method is used to get a quick statistical summary of the data in a DataFrame?

A. df.info()
B. df.describe()
C. df.summary()
D. df.stats()

Answer: B
The describe() method generates descriptive statistics that summarize the central tendency, dispersion, and shape of each numeric column's distribution.

Q6. What does the pd.Series data structure represent in Pandas?

A. A two-dimensional table
B. A one-dimensional labeled array
C. A three-dimensional matrix
D. An unordered collection of items

Answer: B
A Series is a one-dimensional labeled array capable of holding any data type.

Q7. Which function is used to identify missing values in a DataFrame?

A. df.missing()
B. df.isnull()
C. df.nan()
D. df.check_null()

Answer: B
The isnull() method returns a boolean object of the same shape, with True wherever a value is missing (NaN or None).

Q8. How can you drop rows with missing values in a Pandas DataFrame?

A. df.dropna()
B. df.remove_na()
C. df.delete_nan()
D. df.clean()

Answer: A
The dropna() method removes missing values by dropping rows or columns containing them.

Q9. Which method is used to fill missing values with a specific number or string?

A. df.fillna()
B. df.replace()
C. df.update()
D. df.insert()

Answer: A
The fillna() method fills NA/NaN values with a value you specify, such as a constant or the column mean.
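
To see Q7 to Q9 working together, here is a minimal sketch on a made-up three-row table (the column names and values are assumptions for illustration only). It finds the missing values, drops incomplete rows, and fills the gaps instead.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with two missing entries.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "age": [31.0, np.nan, 25.0],
    "city": ["Oslo", "Rome", None],
})

missing_mask = df.isnull()                # same shape as df, True where a value is missing
missing_per_column = df.isnull().sum()    # number of missing values in each column

complete_rows = df.dropna()               # keeps only the rows without any missing value

# Fill each column with its own replacement: the mean age and a placeholder city.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})
```

Note that dropna() and fillna() return new objects here; df itself is left unchanged.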

Q10. How do you select a column named ‘Age’ from a DataFrame ‘df’?

A. df['Age']
B. df.Age
C. Both A and B
D. df(Age)

Answer: C
You can select a column using bracket notation df['Age'] or attribute access df.Age if the name is a valid identifier.

Q11. Which method is used to group data in Pandas for aggregation?

A. df.aggregate()
B. df.group_by()
C. df.groupby()
D. df.cluster()

Answer: C
The groupby() method involves splitting the object, applying a function, and combining the results.
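
The split-apply-combine idea is easier to see on a tiny example. This is a sketch on invented sales figures (the region and amount columns are assumptions): rows are split by region, a function is applied to each group, and the results are combined into one object.

```python
import pandas as pd

# Hypothetical sales records.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "amount": [100, 80, 50, 20],
})

# One aggregate per group: total amount for each region.
totals = sales.groupby("region")["amount"].sum()

# Several aggregates at once: one column per function.
summary = sales.groupby("region")["amount"].agg(["sum", "mean"])
```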

Q12. Which function is used to merge two DataFrames based on a common column?

A. pd.merge()
B. pd.join()
C. pd.concat()
D. pd.combine()

Answer: A
pd.merge() is used to merge DataFrame or named Series objects with a database-style join.

Q13. What is the purpose of the iloc indexer in Pandas?

A. To select data by label
B. To select data by integer position
C. To select data by boolean condition
D. To select data by column name

Answer: B
iloc is primarily integer position based (from 0 to length-1 of the axis), unlike loc which is label-based.
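
A short sketch of the difference, using a made-up DataFrame whose index labels are letters so that positions and labels cannot be confused:

```python
import pandas as pd

df = pd.DataFrame({"score": [88, 92, 79]}, index=["a", "b", "c"])

by_position = df.iloc[0]         # first row, whatever its label is
by_label = df.loc["a"]           # row whose index label is "a"

pos_slice = df.iloc[0:2]         # positions 0 and 1; the end position is excluded
label_slice = df.loc["a":"b"]    # labels "a" through "b"; the end label is included
```

The slicing behaviour is the detail that trips people up most often: iloc follows Python's end-exclusive convention, while loc includes the end label.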

Q14. Which method is used to sort a DataFrame by the values of a specific column?

A. df.order()
B. df.sort_values()
C. df.sort()
D. df.rank()

Answer: B
sort_values() sorts a DataFrame along either axis by the values of the specified column.

Q15. How do you get the number of rows and columns in a DataFrame ‘df’?

A. df.shape
B. df.size
C. df.dimensions
D. df.length

Answer: A
The shape attribute returns a tuple representing the dimensionality (rows, columns) of the DataFrame.

Q16. Which library is primarily used for scientific computing and working with arrays in Python?

A. Pandas
B. NumPy
C. Matplotlib
D. Seaborn

Answer: B
NumPy is the fundamental package for scientific computing with Python, providing a powerful N-dimensional array object.

Q17. What is the name of the main object in NumPy?

A. List
B. DataFrame
C. ndarray
D. Series

Answer: C
The ndarray is a multidimensional, homogeneous array of fixed-size items in NumPy.

Q18. Which function creates an array filled with zeros?

A. np.empty()
B. np.zeros()
C. np.ones()
D. np.full()

Answer: B
np.zeros() returns a new array of given shape and type, filled with zeros.

Q19. How do you check the data type of elements in a NumPy array ‘arr’?

A. arr.type
B. arr.dtype
C. arr.datatype
D. arr.kind

Answer: B
The dtype attribute returns the data-type object associated with the array.

Q20. Which method is used to change the shape of a NumPy array?

A. arr.reshape()
B. arr.resize()
C. arr.shape()
D. arr.modify()

Answer: A
reshape() gives a new shape to an array without changing its data.
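
As a quick illustration (the array contents are arbitrary), reshape() rearranges the same six values into a different layout, and -1 asks NumPy to work out one dimension for you:

```python
import numpy as np

arr = np.arange(6)           # array([0, 1, 2, 3, 4, 5])

grid = arr.reshape(2, 3)     # 2 rows, 3 columns, same six values
auto = arr.reshape(-1, 2)    # -1 lets NumPy infer the row count: shape (3, 2)
```

The total number of elements has to stay the same, so reshaping six values into, say, (4, 2) raises an error.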

Q21. Which Pandas method returns the data types of each column in a DataFrame?

A. df.types
B. df.dtypes
C. df.info()
D. df.columns

Answer: B
The dtypes attribute returns a Series with the data type of each column.

Q22. Which function is used to concatenate Pandas objects along a particular axis?

A. pd.merge()
B. pd.append()
C. pd.concat()
D. pd.join()

Answer: C
pd.concat() concatenates Pandas objects along a particular axis with optional set logic.

Q23. How do you rename a column ‘Old_Name’ to ‘New_Name’ in a DataFrame?

A. df.rename(columns={'Old_Name': 'New_Name'})
B. df.column('Old_Name', 'New_Name')
C. df.change('Old_Name', 'New_Name')
D. df.set_column('New_Name')

Answer: A
The rename() method alters axes labels, allowing specific column renaming via a dictionary.

Q24. What is the output of df['Score'].value_counts()?

A. Sum of scores
B. Mean of scores
C. Unique values and their counts
D. Sorted values

Answer: C
value_counts() returns a Series containing counts of unique values in descending order.

Q25. Which method removes duplicate rows from a DataFrame?

A. df.drop_duplicates()
B. df.remove_duplicates()
C. df.unique()
D. df.nunique()

Answer: A
drop_duplicates() returns a DataFrame with duplicate rows removed.

Q26. Which function is used to create a histogram of a DataFrame column?

A. df.plot.hist()
B. df.graph()
C. df.histogram()
D. df.show_hist()

Answer: A
The plot.hist() method draws one histogram of the DataFrame’s columns.

Q27. How do you filter rows where the column ‘Age’ is greater than 30?

A. df[df['Age'] > 30]
B. df.where('Age' > 30)
C. df.filter('Age' > 30)
D. df.loc('Age' > 30)

Answer: A
You can filter a DataFrame by passing a boolean condition inside the indexing operator.
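
Here is a small sketch of boolean filtering on invented data (the names, ages, and cities are assumptions). It also shows how to combine two conditions, which needs parentheses and the & operator rather than the Python keyword and.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy"],
    "Age": [25, 34, 41],
    "City": ["Oslo", "Rome", "Oslo"],
})

mask = df["Age"] > 30        # boolean Series: False, True, True
over_30 = df[mask]           # keeps only the rows where the mask is True

# Each condition goes in parentheses; & means element-wise "and".
over_30_in_oslo = df[(df["Age"] > 30) & (df["City"] == "Oslo")]
```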

Q28. Which NumPy function is used to generate an array of evenly spaced numbers?

A. np.linspace()
B. np.arange()
C. np.range()
D. Both A and B

Answer: D
np.linspace() generates evenly spaced numbers over an interval, while np.arange() uses a step size.
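
The two functions are easy to mix up, so here is a side-by-side sketch covering the same interval. The interval and step are arbitrary choices for illustration.

```python
import numpy as np

# arange: you choose the step; the stop value is excluded.
by_step = np.arange(0, 1, 0.25)     # 0.0, 0.25, 0.5, 0.75

# linspace: you choose the number of points; the stop value is included by default.
by_count = np.linspace(0, 1, 5)     # 0.0, 0.25, 0.5, 0.75, 1.0
```

When the step is a float, linspace() is usually the safer choice because the number of elements is explicit rather than derived from floating-point arithmetic.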

Q29. How do you apply a custom function to every element in a DataFrame?

A. df.map()
B. df.apply()
C. df.applymap()
D. df.element()

Answer: C
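applymap() applies a function to every element of a DataFrame, whereas apply() works on whole rows or columns. In pandas 2.1 and later, applymap() is deprecated in favour of DataFrame.map(), which does the same thing, so option A is the preferred spelling on current versions.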
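
A hedged sketch of element-wise versus row/column-wise application. Because DataFrame.map() only exists in pandas 2.1 and later, the example checks for it and falls back to applymap() on older versions; the square helper and the data are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

def square(x):
    """Hypothetical helper that works on a single value."""
    return x * x

# Element-wise: the function receives one cell at a time.
if hasattr(df, "map"):           # pandas 2.1 and later
    squared = df.map(square)
else:                            # older pandas versions
    squared = df.applymap(square)

# Column-wise and row-wise: the function receives a whole Series.
column_totals = df.apply(lambda col: col.sum())          # a -> 3, b -> 7
row_totals = df.apply(lambda row: row.sum(), axis=1)     # 4 and 6
```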

Q30. What does the df.info() method provide?

A. Statistical summary
B. Summary of the DataFrame including index dtype and columns
C. First 10 rows
D. Data types only

Answer: B
info() prints a concise summary of a DataFrame, including memory usage and non-null counts.

Q31. Which method is used to create a scatter plot in Pandas?

A. df.plot.scatter()
B. df.scatter()
C. df.plot(kind='scatter')
D. Both A and C

Answer: D
You can create a scatter plot using df.plot.scatter() or df.plot(kind='scatter').

Q32. How do you convert a DataFrame ‘df’ to a NumPy array?

A. df.to_numpy()
B. df.to_array()
C. df.values
D. Both A and C

Answer: D
Both to_numpy() (recommended) and the values attribute return the NumPy representation of the DataFrame.

Q33. Which method is used to handle duplicates by keeping only the first occurrence?

A. df.duplicated(keep='first')
B. df.drop_duplicates(keep='first')
C. df.unique(keep='first')
D. df.keep_first()

Answer: B
drop_duplicates(keep='first') removes duplicate rows and keeps the first occurrence of each; keep='first' is also the default.

Q34. What is the function to set a specific column as the index of a DataFrame?

A. df.set_column()
B. df.set_index()
C. df.index()
D. df.change_index()

Answer: B
set_index() sets the DataFrame index using one or more existing columns.

Q35. Which NumPy function calculates the mean of an array?

A. np.avg()
B. np.mean()
C. np.average()
D. np.median()

Answer: B
np.mean() computes the arithmetic mean along the specified axis.

Q36. How do you write a DataFrame to a CSV file named ‘output.csv’?

A. df.write_csv('output.csv')
B. df.to_csv('output.csv')
C. df.save_csv('output.csv')
D. df.export_csv('output.csv')

Answer: B
The to_csv() method writes the DataFrame to a comma-separated values (csv) file.

Q37. Which function is used to perform an inner join in Pandas?

A. pd.merge(how='inner')
B. pd.merge(how='left')
C. pd.concat(join='inner')
D. pd.join(type='inner')

Answer: A
pd.merge() with how='inner' uses the intersection of keys from both frames, similar to a SQL inner join.
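
To make the join types concrete, here is a sketch with two invented tables (customers and orders, joined on a hypothetical cust_id column). An inner join keeps only keys present in both tables; a left join keeps every row of the left table.

```python
import pandas as pd

customers = pd.DataFrame({"cust_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
orders = pd.DataFrame({"cust_id": [1, 1, 4], "total": [20.0, 35.5, 12.0]})

# Inner join: only cust_id 1 appears in both tables (twice in orders).
inner = pd.merge(customers, orders, on="cust_id", how="inner")

# Left join: all customers are kept; those without orders get NaN in "total".
left = pd.merge(customers, orders, on="cust_id", how="left")
```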

Q38. Which method returns the unique values in a Series?

A. series.unique()
B. series.nunique()
C. series.duplicates()
D. series.distinct()

Answer: A
unique() returns the unique values in the order of appearance, while nunique() returns the count.

Q39. How do you transpose a DataFrame ‘df’?

A. df.T
B. df.transpose()
C. df.flip()
D. Both A and B

Answer: D
Both the T attribute and the transpose() method swap rows and columns.

Q40. What does the axis=1 parameter mean in a DataFrame operation?

A. Apply the function along rows
B. Apply the function along columns
C. Apply the function to the whole DataFrame
D. Apply the function to the index

Answer: B
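axis=1 refers to the columns axis: the operation runs horizontally across the columns. An aggregation such as df.sum(axis=1) therefore returns one result per row, while df.drop(..., axis=1) removes a column.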
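
The axis parameter is a classic source of confusion, so here is a minimal sketch on a made-up marks table showing what each value does for an aggregation and for drop():

```python
import pandas as pd

df = pd.DataFrame({"math": [90, 70], "art": [80, 60]}, index=["ann", "bob"])

per_column = df.sum(axis=0)    # collapses the rows: one total per column (math 160, art 140)
per_row = df.sum(axis=1)       # collapses the columns: one total per row (ann 170, bob 130)

without_art = df.drop("art", axis=1)   # axis=1 makes drop() look for a column label
```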

Q41. Which Pandas function reads data from an Excel file?

A. pd.read_excel()
B. pd.read_xls()
C. pd.load_excel()
D. pd.open_excel()

Answer: A
read_excel() reads an Excel file into a Pandas DataFrame.

Q42. Which NumPy function generates random numbers from a standard normal distribution?

A. np.random.rand()
B. np.random.randn()
C. np.random.randint()
D. np.random.normal()

Answer: B
np.random.randn() returns a sample from the “standard normal” distribution.

Q43. How do you get the total number of non-null values in each column?

A. df.null_count()
B. df.notnull().sum()
C. df.count()
D. Both B and C

Answer: D
count() returns the number of non-NA/null observations, which is equivalent to notnull().sum().

Q44. What is the function of the df.pivot_table() method?

A. To reshape data from wide to long format
B. To create a spreadsheet-style pivot table as a DataFrame
C. To stack columns
D. To unstack rows

Answer: B
pivot_table() creates a spreadsheet-style pivot table containing summarized data.

Q45. How do you convert a list into a NumPy array?

A. np.to_array(list)
B. np.array(list)
C. np.create(list)
D. np.list_to_array(list)

Answer: B
np.array() creates an array from any object exposing the array interface, like a list.

Q46. Which method is used to find the correlation between columns in a DataFrame?

A. df.corr()
B. df.relate()
C. df.cov()
D. df.connect()

Answer: A
corr() computes pairwise correlation of columns, excluding NA/null values.

Q47. How do you iterate over the rows of a DataFrame?

A. df.rows()
B. df.iterrows()
C. df.itertuples()
D. Both B and C

Answer: D
iterrows() iterates as (index, Series) pairs, while itertuples() iterates as namedtuples (faster).

Q48. Which function is used to stack the prescribed level(s) from columns to index?

A. df.stack()
B. df.unstack()
C. df.melt()
D. df.pivot()

Answer: A
stack() “compresses” a level in the DataFrame’s columns to the index.

Q49. Which method converts a “wide” format DataFrame to a “long” format?

A. df.pivot()
B. df.melt()
C. pd.wide_to_long()
D. Both B and C

Answer: D
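Both reshape data from wide to long format. melt() is available as a DataFrame method, while wide_to_long() is the top-level function pd.wide_to_long() rather than a DataFrame method.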
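
A small sketch of wide-to-long reshaping with melt(), on an invented marks table. The column names passed to var_name and value_name are arbitrary choices. The last line pivots the long table back to show that no information is lost.

```python
import pandas as pd

# Wide format: one column per subject.
wide = pd.DataFrame({"student": ["ann", "bob"], "math": [90, 70], "art": [80, 60]})

# Long format: one row per (student, subject) pair.
long = wide.melt(id_vars="student", var_name="subject", value_name="score")

# Back to wide format, with students as the index.
back = long.pivot(index="student", columns="subject", values="score")
```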

Q50. What does np.dot() do in NumPy?

A. Performs element-wise multiplication
B. Performs matrix multiplication
C. Calculates the dot product of two arrays
D. Both B and C

Answer: D
np.dot() handles both dot products for vectors and matrix multiplication for 2-D arrays.
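
To show both behaviours, plus the contrast with element-wise multiplication from option A, here is a sketch with arbitrary small arrays:

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
inner = np.dot(v, w)          # 1*4 + 2*5 + 3*6 = 32

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
product = np.dot(A, B)        # matrix product: [[2, 1], [4, 3]]
elementwise = A * B           # element-wise product: [[0, 2], [3, 0]]
```

For 2-D arrays, the @ operator (A @ B) gives the same matrix product and is often easier to read.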

Q51. How can you delete a column named ‘City’ from a DataFrame?

A. df.drop('City', axis=1)
B. del df['City']
C. df.pop('City')
D. All of the above

Answer: D
All three methods can be used to remove a column, though drop returns a new object by default.

Q52. Which method creates a Series from a dictionary?

A. pd.Series(dict)
B. pd.to_series(dict)
C. pd.create_series(dict)
D. pd.from_dict(dict)

Answer: A
Pandas automatically converts a dictionary to a Series where keys become the index.

Q53. What is the output of np.arange(0, 10, 2)?

A. [0, 2, 4, 6, 8]
B. [0, 2, 4, 6, 8, 10]
C. [2, 4, 6, 8]
D. [0, 1, 2, 3, 4]

Answer: A
np.arange(start, stop, step) generates values within the half-open interval [start, stop).

Q54. Which function converts a string column to datetime objects?

A. pd.to_datetime()
B. pd.datetime()
C. pd.convert_datetime()
D. pd.date_parse()

Answer: A
to_datetime() converts its argument to datetime and can parse many common string formats.

Q55. How do you get the memory usage of each column in a DataFrame?

A. df.memory_usage()
B. df.memory()
C. df.size()
D. df.info(memory=True)

Answer: A
memory_usage() returns the memory usage of each column in bytes.

Q56. Which method checks if a DataFrame index has duplicate values?

A. df.index.is_unique
B. df.index.duplicated()
C. df.is_duplicate()
D. Both A and B

Answer: A
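The is_unique attribute returns a single boolean indicating whether every index value is unique. df.index.duplicated() also exists, but it returns a boolean array marking each repeated value rather than one yes/no answer.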

Q57. Which NumPy method reverses a multi-dimensional array?

A. np.reverse()
B. np.flip()
C. np.invert()
D. arr[::-1]

Answer: B
np.flip() reverses the order of elements along the given axis, or along every axis when no axis is given. arr[::-1] reverses only the first axis.

Q58. How do you extract the day of the week from a datetime column ‘Date’?

A. df['Date'].dt.day
B. df['Date'].dt.dayofweek
C. df['Date'].day
D. df['Date'].weekday

Answer: B
The dt accessor allows you to access datetime properties like dayofweek from a Series.

Q59. Which function calculates the cumulative sum of a Series?

A. series.sum()
B. series.cumsum()
C. series.rolling_sum()
D. series.total()

Answer: B
cumsum() returns a Series of the cumulative sum of the elements.

Q60. Which method is used to replace values in a DataFrame?

A. df.fillna()
B. df.replace()
C. df.substitute()
D. df.swap()

Answer: B
replace() replaces values given in ‘to_replace’ with ‘value’.

Q61. How can you detect outliers using the IQR method in Pandas?

A. df.quantile()
B. df.describe()
C. Using IQR calculation with quantile()
D. df.outliers()

Answer: C
You calculate Q1 and Q3 using quantile() and filter values outside the 1.5 * IQR range.
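
The recipe above can be sketched as follows, on a made-up Series with one obviously extreme value. The 1.5 multiplier is the conventional choice mentioned in the explanation.

```python
import pandas as pd

values = pd.Series([10, 12, 11, 13, 12, 95])

q1 = values.quantile(0.25)    # 11.25
q3 = values.quantile(0.75)    # 12.75
iqr = q3 - q1                 # 1.5

lower = q1 - 1.5 * iqr        # 9.0
upper = q3 + 1.5 * iqr        # 15.0

# Anything outside [lower, upper] is flagged as an outlier.
outliers = values[(values < lower) | (values > upper)]
```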

Q62. Which function reads a JSON string or file into a DataFrame?

A. pd.read_json()
B. pd.read_js()
C. pd.load_json()
D. pd.parse_json()

Answer: A
read_json() converts a JSON string or file to a Pandas object.

Q63. What does the loc indexer do?

A. Selects by integer position
B. Selects by label
C. Selects by condition
D. Both B and C

Answer: D
loc is primarily label-based, but it also accepts a boolean array for conditional filtering.

Q64. How do you change the data type of a column ‘Price’ to float?

A. df['Price'].astype(float)
B. df['Price'].to_float()
C. df['Price'].convert(float)
D. df['Price'].type = float

Answer: A
astype() returns a new object cast to the specified dtype; assign it back (df['Price'] = df['Price'].astype(float)) to keep the change.

Q65. Which NumPy function returns the indices of the maximum value?

A. np.max()
B. np.argmax()
C. np.maximum()
D. np.where_max()

Answer: B
np.argmax() returns the indices of the maximum values along an axis.

Q66. Which method is used for window-based calculations like moving averages?

A. df.window()
B. df.rolling()
C. df.shift()
D. df.expanding()

Answer: B
rolling() provides rolling window calculations, commonly used for moving averages.
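
A minimal moving-average sketch on invented prices, with expanding() from option D alongside for contrast:

```python
import pandas as pd

prices = pd.Series([10.0, 12.0, 14.0, 16.0])

# Fixed window of 2: the first value has no full window, so it is NaN.
moving_avg = prices.rolling(window=2).mean()    # NaN, 11.0, 13.0, 15.0

# Expanding window: every value from the start up to the current row.
running_avg = prices.expanding().mean()         # 10.0, 11.0, 12.0, 13.0
```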

Q67. How do you drop a column permanently in a DataFrame?

A. df.drop('col', axis=1, inplace=True)
B. df = df.drop('col', axis=1)
C. df.remove('col')
D. Both A and B

Answer: D
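You can either set inplace=True or reassign the result to the variable. Note that drop() needs axis=1 (or columns='col') to target a column; without it, drop() looks for a row label.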

Q68. Which method creates a frequency table cross-tabulation?

A. pd.crosstab()
B. pd.freq_table()
C. df.pivot()
D. df.tabulate()

Answer: A
pd.crosstab() computes a simple cross tabulation of two (or more) factors.

Q69. What is used to convert categorical variables into dummy variables?

A. pd.get_dummies()
B. pd.encode()
C. pd.factorize()
D. pd.categorical()

Answer: A
get_dummies() converts categorical variables into dummy/indicator variables (one-hot encoding).
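
Here is what that looks like on a made-up color column. Depending on the pandas version, the indicator columns come back as booleans or as 0/1 integers, so the sketch casts them to int for a uniform result.

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "blue", "red"]})

# One indicator column per category, named <column>_<category>.
dummies = pd.get_dummies(df, columns=["color"])

# Cast to int so the result is 0/1 regardless of the pandas version.
dummies = dummies.astype(int)
```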

Q70. Which NumPy function saves an array to a binary file?

A. np.save()
B. np.write()
C. np.dump()
D. np.export()

Answer: A
np.save() saves an array to a binary file in NumPy .npy format.

Q71. How do you check if two DataFrames are equal?

A. df1 == df2
B. df1.equals(df2)
C. df1.compare(df2)
D. df1.is_equal(df2)

Answer: B
equals() compares two DataFrames element-wise and returns True if they are identical.

Q72. Which method is used to find the index of the minimum value in a Series?

A. series.min()
B. series.idxmin()
C. series.argmin()
D. Both B and C

Answer: B
idxmin() returns the index label of the minimum value, while argmin() returns its integer position.
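
The label-versus-position distinction only shows up when the index is not the default 0, 1, 2, so this sketch uses letter labels (the data is invented):

```python
import pandas as pd

s = pd.Series([30, 10, 20], index=["x", "y", "z"])

label = s.idxmin()       # "y": the index label of the smallest value
position = s.argmin()    # 1: the integer position of the smallest value
```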

Q73. How do you sample random rows from a DataFrame?

A. df.sample()
B. df.random()
C. df.select()
D. df.pick()

Answer: A
sample() returns a random sample of items from an axis of the object (rows by default).

Q74. Which function is used to create a NumPy array filled with ones?

A. np.zeros()
B. np.ones()
C. np.empty()
D. np.full()

Answer: B
np.ones() returns a new array of given shape and type, filled with ones.

Q75. What attribute returns the column labels of a DataFrame?

A. df.index
B. df.columns
C. df.rows
D. df.labels

Answer: B
The columns attribute returns an Index object containing the column labels.

Q76. How do you count the number of unique values in a column?

A. df['col'].unique()
B. df['col'].nunique()
C. df['col'].count()
D. df['col'].value_counts()

Answer: B
nunique() returns the number of distinct elements in the object.

Q77. Which function reads data from a SQL database into a DataFrame?

A. pd.read_sql()
B. pd.read_db()
C. pd.read_table()
D. pd.import_sql()

Answer: A
read_sql() reads a SQL query or database table into a DataFrame.

Q78. Which method is used to reset the index of a DataFrame?

A. df.reset_index()
B. df.reindex()
C. df.set_index()
D. df.new_index()

Answer: A
reset_index() resets the index of the DataFrame, and uses the default one instead.

Q79. What is the output of np.ndim([[1, 2], [3, 4]])?

A. 1
B. 2
C. 4
D. 3

Answer: B
np.ndim() returns the number of dimensions (axes) of the array, which is 2 for a matrix.

Q80. How do you calculate the standard deviation of a DataFrame column?

A. df['col'].std()
B. df['col'].var()
C. df['col'].deviation()
D. df['col'].stats()

Answer: A
std() calculates the sample standard deviation of the Series.

Q81. Which NumPy function returns the identity matrix?

A. np.eye()
B. np.identity()
C. np.ones()
D. Both A and B

Answer: D
Both np.eye(n) and np.identity(n) return an n x n array with ones on the main diagonal and zeros elsewhere. np.eye() can additionally build non-square arrays and place the ones on an offset diagonal.

Q82. Which method splits a string column into multiple columns?

A. df['col'].split()
B. df['col'].str.split()
C. df.divide('col')
D. df.cut('col')

Answer: B
The .str accessor exposes split() for the strings in a Series. Pass expand=True to get the pieces back as separate columns instead of lists.
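
A short sketch on a made-up full_name column: without expand=True each row holds a list, and with expand=True the pieces land in separate columns. The new column names first and last are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({"full_name": ["Ann Lee", "Bob Ray"]})

as_lists = df["full_name"].str.split(" ")                   # one list per row
as_columns = df["full_name"].str.split(" ", expand=True)    # DataFrame with columns 0 and 1

# Attach the pieces to the original frame under readable names.
df[["first", "last"]] = as_columns
```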

Q83. How do you select multiple columns ‘A’ and ‘B’ from a DataFrame?

A. df['A', 'B']
B. df[['A', 'B']]
C. df.loc[:, 'A':'B']
D. Both B and C

Answer: D
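Passing a list of column names selects exactly those columns. The loc slice 'A':'B' selects every column from A through B inclusive, so it gives the same result only when A and B are adjacent.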

Q84. Which function is used to bin values into discrete intervals?

A. pd.cut()
B. pd.qcut()
C. pd.bin()
D. pd.interval()

Answer: A
pd.cut() bins values into discrete intervals, useful for converting continuous data to categorical.

Q85. What is the result of df['A'].between(10, 20)?

A. Values between 10 and 20
B. Boolean Series indicating values between 10 and 20
C. Count of values in range
D. Error

Answer: B
between() returns a boolean vector equivalent to left <= series <= right.

Q86. How do you find the percentage change between current and prior element?

A. df.change()
B. df.pct_change()
C. df.diff()
D. df.shift()

Answer: B
pct_change() calculates the percentage change between the current and a prior element.

Q87. Which NumPy function flattens a multi-dimensional array?

A. arr.flatten()
B. arr.ravel()
C. arr.flat()
D. Both A and B

Answer: D
Both flatten() and ravel() return a flattened 1-D array, but ravel returns a view if possible.
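
The copy-versus-view difference matters as soon as you modify the result. A small sketch with an arbitrary 2x2 array:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

copy_1d = arr.flatten()    # always an independent copy
view_1d = arr.ravel()      # a view here, because arr is contiguous in memory

copy_1d[0] = 99            # arr is unaffected
view_1d[1] = -1            # arr[0, 1] changes too
```

If you need a guaranteed copy, use flatten(); if you want to avoid copying when possible, use ravel() and treat the result as read-only unless you intend to modify the original.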

Q88. How do you remove a row with index label ‘X’?

A. df.drop('X', axis=0)
B. df.drop('X')
C. df.del('X')
D. df.remove('X')

Answer: A
drop() removes rows by label when axis=0. Because axis=0 is the default, option B, df.drop('X'), does exactly the same thing.

Q89. Which function is used to read data from the clipboard?

A. pd.read_clipboard()
B. pd.read_data()
C. pd.from_clipboard()
D. pd.import_clip()

Answer: A
read_clipboard() reads text from the clipboard and passes it to read_csv().

Q90. Which method creates a bar plot in Pandas?

A. df.plot.bar()
B. df.bar()
C. df.plot(kind='bar')
D. Both A and C

Answer: D
Both df.plot.bar() and df.plot(kind='bar') create a vertical bar plot.

Q91. What does df.diff() calculate?

A. Difference between consecutive elements
B. Difference between columns
C. Difference from mean
D. Differential equation solution

Answer: A
diff() calculates the difference between each element and the previous one (the first discrete difference).

Q92. How do you replace all occurrences of a substring in a string column?

A. df['col'].str.replace()
B. df['col'].replace()
C. df['col'].str.sub()
D. df['col'].swap()

Answer: A
The .str.replace() method is specifically for replacing occurrences of pattern/regex in strings.

Q93. Which function returns a boolean array where values are Not a Number (NaN)?

A. pd.isna()
B. pd.notna()
C. pd.isnan()
D. pd.check_na()

Answer: A
isna() returns True for missing values (NaN or None), while notna() returns True for non-missing values. pd.isnan() does not exist; the NumPy function is np.isnan().

Q94. How do you find the intersection of two Series?

A. pd.intersect()
B. s1.intersection(s2)
C. s1 & s2
D. Both B and C

Answer: D
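Strictly speaking, these operations work on Index objects or Python sets rather than on Series directly: s1.index.intersection(s2.index) intersects the index labels, and set(s1) & set(s2) intersects the values. Applied to two Series, & performs an element-wise logical AND. To keep the values of s1 that also appear in s2, use s1[s1.isin(s2)].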
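
Since "intersection of two Series" can mean the values or the index labels, here is a hedged sketch of both readings on invented data:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4])
s2 = pd.Series([3, 4, 5])

# Values of s1 that also occur anywhere in s2.
common_values = s1[s1.isin(s2)]                      # 3 and 4

# Index labels shared by both Series (default labels 0..3 and 0..2).
common_labels = s1.index.intersection(s2.index)      # 0, 1, 2

# Plain Python sets also work when order and duplicates do not matter.
common_set = set(s1) & set(s2)                       # {3, 4}
```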

Q95. Which parameter in pd.read_csv() handles files with different delimiters?

A. sep
B. delimiter
C. parse
D. Both A and B

Answer: D
Both the sep and delimiter parameters specify the delimiter to use for parsing the file; delimiter is an alias for sep.

Q96. Which NumPy function computes the standard deviation?

A. np.std()
B. np.var()
C. np.dev()
D. np.sqrt(np.var())

Answer: A
np.std() computes the standard deviation along the specified axis.

Q97. How do you categorize continuous data into quantiles?

A. pd.cut()
B. pd.qcut()
C. pd.bucket()
D. pd.range()

Answer: B
pd.qcut() is a quantile-based discretization function: it picks the bin edges so that each bin holds roughly the same number of observations.
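
To contrast pd.qcut() with pd.cut(), here is a sketch on made-up ages. The bin edges and label names are arbitrary choices for illustration.

```python
import pandas as pd

ages = pd.Series([5, 17, 18, 40, 65, 90])

# cut: you choose the edges; intervals are (0, 18], (18, 65], (65, 120].
by_edges = pd.cut(ages, bins=[0, 18, 65, 120], labels=["minor", "adult", "senior"])

# qcut: you choose the number of bins; edges are placed at the quantiles.
by_quantile = pd.qcut(ages, q=3, labels=["low", "mid", "high"])
```

With cut() the bins can hold very different numbers of rows (three, two, and one here), while qcut() puts two ages in each of its three bins.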

Q98. Which method filters data based on query string?

A. df.filter()
B. df.query()
C. df.search()
D. df.find()

Answer: B
query() filters the DataFrame using a boolean expression string.

Q99. How can you remove leading and trailing whitespace from a string column?

A. df['col'].trim()
B. df['col'].str.strip()
C. df['col'].clean()
D. df['col'].remove_space()

Answer: B
str.strip() removes leading and trailing characters (whitespace by default) from strings.

Q100. Which attribute returns the number of dimensions of a DataFrame?

A. df.ndim
B. df.shape
C. df.size
D. df.dim

Answer: A
The ndim attribute returns an integer representing the number of array dimensions (2 for DataFrame).

Conclusion

That’s it for this 100 Python for Data Analysis MCQs question bank! I know, I know, it looks big and hard to remember everything, but you don’t need to memorise it all. Just bookmark this page and go through it twice a week. With consistent revision, you’ll naturally start remembering the concepts and understanding them deeply within a few weeks.

Now, if you want to go further in Python and strengthen your interview preparation, check out these curated resources:

100 Python Interview Questions
If you are into Machine Learning, check out Machine Learning Interview Questions

These resources will help you understand and prepare for real-world Python development and interviews.