Pandas is an open-source Python library that provides high-performance data manipulation and analysis tools built on its powerful data structures. The name Pandas is derived from "panel data", an econometrics term for multidimensional data.
A dataframe is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.
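As a minimal sketch of what "rows and columns" means in practice, the snippet below builds a tiny dataframe by hand. The column names `Name` and `Speed` and the values are illustrative assumptions, not rows loaded from the tutorial's dataset.

```python
import pandas as pd

# Illustrative toy values only; column names are assumptions for this sketch.
df = pd.DataFrame(
    {
        "Name": ["Bulbasaur", "Charmander", "Squirtle"],
        "Speed": [45, 65, 43],
    }
)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels
```

Each dictionary key becomes a column and each list position becomes a row, which is the tabular layout described above.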
For the purpose of this tutorial we will be using the pokemon dataset. You can download it for your own purposes here
Since this notebook is hosted on Google Colab, I am using my Google Drive to load my dataset.
To read the CSV, instead of using Python's built-in csv module like this:

    import csv

    with open('employee_birthday.txt') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')

we will just use the pandas read_csv function, which loads the CSV directly into a dataframe. Other such functions include read_json, read_sql, read_html, etc.
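A rough sketch of the read_csv call follows. Because the actual file path on Google Drive is not shown here, the example reads from an in-memory string instead; the CSV text, the column names, and the rows are all hypothetical stand-ins for the real file.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real dataset file.
csv_text = """Name,Speed,Legendary
Alpha,50,False
Beta,50,False
Gamma,130,True
"""

# read_csv accepts a file path, a URL, or a file-like object
# and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the table
print(df.dtypes)   # column types inferred by pandas
```

With a real file you would pass its path as the first argument in place of the `io.StringIO` object.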
If you want to get the value count for each speed use the value_counts() function
As you can see, the speed value 50 is the most common among pokemon.
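The value_counts() step described above could look roughly like this. The `Speed` column name is an assumption, and the toy values are chosen only so that 50 comes out on top; they are not the real dataset.

```python
import pandas as pd

# Toy speed values for illustration; "Speed" is an assumed column name.
df = pd.DataFrame({"Speed": [50, 50, 65, 50, 65, 130]})

# value_counts() returns a Series indexed by the distinct values,
# sorted so the most frequent value comes first.
speed_counts = df["Speed"].value_counts()
print(speed_counts)

# The index label with the highest count is the most common speed.
most_common_speed = speed_counts.idxmax()
print(most_common_speed)
```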
We can do the same to find the pokemon with the greatest attack power, which is Mewtwo X.
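One possible way to look up the row with the greatest attack value is idxmax(), sketched below. This may not be the exact approach used in the notebook, and the `Name` and `Attack` columns and their values are placeholders.

```python
import pandas as pd

# Placeholder rows; "Name" and "Attack" are assumed column names.
df = pd.DataFrame(
    {
        "Name": ["Alpha", "Beta", "Gamma"],
        "Attack": [49, 190, 84],
    }
)

# idxmax() returns the index label of the row holding the maximum value.
strongest_idx = df["Attack"].idxmax()

# loc[] then retrieves the full row for that label.
strongest = df.loc[strongest_idx]
print(strongest["Name"], strongest["Attack"])
```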
Now this is a Series of True and False values telling us which rows are legendary and which are not. To get the actual dataframe for just the legendary pokemon, we use it to index the dataframe.
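The filtering step can be sketched as follows, assuming a boolean column named `Legendary`. The rows are made up for illustration.

```python
import pandas as pd

# Made-up rows; "Legendary" is assumed to be a boolean column.
df = pd.DataFrame(
    {
        "Name": ["Alpha", "Beta", "Gamma", "Delta"],
        "Legendary": [False, True, False, True],
    }
)

# The comparison yields one True/False value per row.
mask = df["Legendary"] == True
print(mask)

# Indexing the dataframe with the mask keeps only the rows marked True.
legendary_df = df[mask]
print(legendary_df)
```

The filtered dataframe keeps the original index labels, so call `reset_index(drop=True)` on it if you want the rows renumbered from zero.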
Let's try the filter again after the replacement