Saturday, September 25, 2021

DataFrame in Python Class - 1

 DataFrame:

1. A DataFrame is a two-dimensional (2-D) data structure.

2. It is provided by the Python pandas library.

3. It is a heterogeneous data structure.

4. It is just like a table or spreadsheet.

5. It can contain any number of rows and columns.

6. Different columns can have different data types.

7. The size of a DataFrame is mutable.

8. The values in a DataFrame are also mutable.

9. Arithmetic operations can be performed on rows and columns.

10. It can store different types of values.

Example:

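import pandas as pd

# Row labels (index)
I=['A','B','C','D','E']
# Column name -> list of values
D={"2019":[50,60,70,80,90],"2020":[40,35,45,55,32]}
# Build the DataFrame from the dictionary, using I as the index
DF=pd.DataFrame(D,index=I)
print(DF)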


Output:

   2019  2020
A    50    40
B    60    35
C    70    45
D    80    55
E    90    32
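
The points about mutable size, mutable values, and arithmetic can be illustrated with a minimal sketch built on the same data. The column name "Total" and the row label "F" are hypothetical names chosen only for this illustration.

```python
import pandas as pd

# Same data as the example above
index = ['A', 'B', 'C', 'D', 'E']
data = {"2019": [50, 60, 70, 80, 90], "2020": [40, 35, 45, 55, 32]}
df = pd.DataFrame(data, index=index)

# Arithmetic on columns, and mutable size: add a new column
# ("Total" is a hypothetical column name)
df["Total"] = df["2019"] + df["2020"]

# Mutable values: overwrite a single cell by row label and column label.
# Note that "Total" is not recalculated automatically.
df.loc["A", "2020"] = 45

# Mutable size: add a new row ("F" is a hypothetical row label)
df.loc["F"] = [100, 60, 160]

print(df)
print(df.shape)
```

After these steps the DataFrame has grown from 5 rows and 2 columns to 6 rows and 3 columns.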


 DataFrame Attributes / Properties:

DataFrame has the following attributes:

1. index
2. columns
3. axes
4. dtypes
5. size
6. shape
7. ndim
8. empty
9. count (a method rather than an attribute; it is called as count())
10. T


1. index

It displays the index (row labels) of the DataFrame.

import pandas as pd
I=['IP','Bio','Chemistry','Physics','English']
D={"2018":[50,60,70,80,90],"2019":[40,35,45,55,32],
   "2020":[65,75,85,45,52]}
df=pd.DataFrame(D,I)
df.index.name="Subject"
print(df)
print("Index of Data Frame")
print(df.index)

Output:

           2018  2019  2020
Subject                    
IP           50    40    65
Bio          60    35    75
Chemistry    70    45    85
Physics      80    55    45
English      90    32    52
Index of Data Frame
Index(['IP', 'Bio', 'Chemistry', 'Physics', 'English'], dtype='object', name='Subject')


2. columns

It displays the column names of the DataFrame.

import pandas as pd
I=['IP','Bio','Chemistry','Physics','English']
D={"2018":[50,60,70,80,90],"2019":[40,35,45,55,32],
   "2020":[65,75,85,45,52]}
df=pd.DataFrame(D,I)
df.index.name="Subject"
print(df)
print("Columns of Data Frame")
print(df.columns)


Output:

           2018  2019  2020
Subject                    
IP           50    40    65
Bio          60    35    75
Chemistry    70    45    85
Physics      80    55    45
English      90    32    52
Columns of Data Frame
Index(['2018', '2019', '2020'], dtype='object')



3. axes

It displays a list containing both the row index and the column names of the DataFrame.

import pandas as pd
I=['IP','Bio','Chemistry','Physics','English']
D={"2018":[50,60,70,80,90],"2019":[40,35,45,55,32],
   "2020":[65,75,85,45,52]}
df=pd.DataFrame(D,I)
df.index.name="Subject"
print(df)
print("Axes of Data Frame")
print(df.axes)


Output:

           2018  2019  2020
Subject                    
IP           50    40    65
Bio          60    35    75
Chemistry    70    45    85
Physics      80    55    45
English      90    32    52
Axes of Data Frame
[Index(['IP', 'Bio', 'Chemistry', 'Physics', 'English'], dtype='object', name='Subject'), Index(['2018', '2019', '2020'], dtype='object')]
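
Since axes is a list of two items, the row index and the column names can be unpacked from it separately. A minimal sketch, assuming the same marks data as above; the variable names row_axis and column_axis are chosen only for this illustration.

```python
import pandas as pd

subjects = ['IP', 'Bio', 'Chemistry', 'Physics', 'English']
marks = {"2018": [50, 60, 70, 80, 90], "2019": [40, 35, 45, 55, 32],
         "2020": [65, 75, 85, 45, 52]}
df = pd.DataFrame(marks, index=subjects)

# df.axes is a list of two Index objects: [row index, column names]
row_axis, column_axis = df.axes

print(row_axis.equals(df.index))       # True
print(column_axis.equals(df.columns))  # True
```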


4. dtypes

It displays the data type of each column. In the example below, the single value 80.5 makes the whole "2018" column float64.

import pandas as pd
I=['IP','Bio','Chemistry','Physics','English']
D={"2018":[50,60,70,80.5,90],"2019":[40,35,45,55,32],
   "2020":[65,75,85,45,52]}
df=pd.DataFrame(D,I)
df.index.name="Subject"
print(df)
print("Data Type of Data Frame")
print(df.dtypes)



Output:


           2018  2019  2020
Subject                    
IP         50.0    40    65
Bio        60.0    35    75
Chemistry  70.0    45    85
Physics    80.5    55    45
English    90.0    32    52
Data Type of Data Frame
2018    float64
2019      int64
2020      int64
dtype: object
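
Because a DataFrame is heterogeneous, each column can carry its own data type. The following is a minimal sketch with hypothetical data (the column names marks, percent, and passed are made up for this illustration) that checks the type of each column using helper functions from pd.api.types.

```python
import pandas as pd

# Hypothetical data: each column holds a different kind of value
df = pd.DataFrame({
    "marks": [50, 60, 70],             # whole numbers
    "percent": [50.0, 60.5, 70.25],    # decimal numbers
    "passed": [True, True, False],     # booleans
})

print(df.dtypes)

# Check the type of each column individually
marks_is_int = pd.api.types.is_integer_dtype(df["marks"])
percent_is_float = pd.api.types.is_float_dtype(df["percent"])
passed_is_bool = pd.api.types.is_bool_dtype(df["passed"])

print(marks_is_int, percent_is_float, passed_is_bool)
```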

5. size

It displays the size of the DataFrame, i.e. the total number of elements (rows × columns).

import pandas as pd
I=['IP','Bio','Chemistry','Physics','English']
D={"2018":[50,60,70,80.5,90],"2019":[40,35,45,55,32],
   "2020":[65,75,85,45,52]}
df=pd.DataFrame(D,I)
df.index.name="Subject"
print(df)
print("size of Data Frame")
print(df.size)

Output:


           2018  2019  2020
Subject                    
IP         50.0    40    65
Bio        60.0    35    75
Chemistry  70.0    45    85
Physics    80.5    55    45
English    90.0    32    52
size of Data Frame
15

6. shape

It displays the number of rows and columns as a tuple (rows, columns).

import pandas as pd
I=['IP','Bio','Chemistry','Physics','English']
D={"2018":[50,60,70,80.5,90],"2019":[40,35,45,55,32],
   "2020":[65,75,85,45,52]}
df=pd.DataFrame(D,I)
df.index.name="Subject"
print(df)
print("Shape of Data Frame")
print(df.shape)

Output:

           2018  2019  2020
Subject                    
IP         50.0    40    65
Bio        60.0    35    75
Chemistry  70.0    45    85
Physics    80.5    55    45
English    90.0    32    52
Shape of Data Frame
(5, 3)
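
The remaining names in the attribute list (ndim, empty, T, and count) are not demonstrated in this post. A minimal sketch of what they return, assuming the same marks data as above, might look like this; it also checks that size equals rows × columns from shape.

```python
import pandas as pd

subjects = ['IP', 'Bio', 'Chemistry', 'Physics', 'English']
marks = {"2018": [50, 60, 70, 80.5, 90], "2019": [40, 35, 45, 55, 32],
         "2020": [65, 75, 85, 45, 52]}
df = pd.DataFrame(marks, index=subjects)

# shape is (rows, columns); size is their product
rows, cols = df.shape
print(rows * cols == df.size)  # True

print(df.ndim)     # 2, the number of dimensions
print(df.empty)    # False, because the DataFrame has elements
print(df.T.shape)  # (3, 5), transposing swaps rows and columns

# count() is a method: number of non-missing values in each column
print(df.count())
```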
