Personal addition to pandas data ETL for faster and better performance

bigmb/mb_pandas

mb_pandas

A personal add-on to the pandas package for faster and more convenient data wrangling.

Main functions:

1) pandas profile (from mb_pandas.src.profiler import create_profile)

2) df load using async for larger datasets (from mb_pandas.src.dfload import load_any_df)

3) df compare (from mb_pandas.src.profiler import profile_compare)

4) df transformation for basic functions (from mb_pandas.src.tranform import *):
['check_null','remove_unnamed','rename_columns','check_drop_duplicates','get_dftype']
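
The transformation helpers listed above are not documented here, so their exact signatures are unknown. As a rough illustration of the kind of cleanup their names suggest, here is a minimal sketch in plain pandas. The function names `null_counts`, `drop_unnamed` and `drop_duplicate_rows` are hypothetical stand-ins, not the mb_pandas API, and the behavior is an assumption based on the helper names only.

```python
import pandas as pd


def null_counts(df: pd.DataFrame) -> pd.Series:
    """Count missing values per column (hypothetical stand-in for check_null)."""
    return df.isna().sum()


def drop_unnamed(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns whose name starts with 'Unnamed', the usual residue of a
    saved index column (hypothetical stand-in for remove_unnamed)."""
    keep = [c for c in df.columns if not str(c).startswith("Unnamed")]
    return df[keep]


def drop_duplicate_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Remove fully duplicated rows and renumber the index
    (hypothetical stand-in for check_drop_duplicates)."""
    return df.drop_duplicates().reset_index(drop=True)


# Small example frame with an index-residue column, one repeated row
# and one missing value.
df = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 2],
        "name": ["a", "b", "b"],
        "value": [1.0, 2.0, 2.0],
        "note": [None, "x", "x"],
    }
)

clean = drop_duplicate_rows(drop_unnamed(df))
print(null_counts(clean))
print(clean)
```

The "Unnamed: 0" column has to be dropped before deduplicating here, because its distinct values would otherwise make every row look unique.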

Scripts:

1) df_profile - creates a profile for any df (default folder: /home/malav/pandas_profiles/)
2) df_view - views a csv or parquet file
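
How df_view reads its input is not described here. As a hedged sketch of the general idea, previewing a csv or parquet file in plain pandas can be done by dispatching on the file extension. The helper `read_table` and its `nrows` parameter are hypothetical and are not part of mb_pandas. Reading parquet additionally assumes a parquet engine such as pyarrow is installed.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_table(path, nrows=None):
    """Read a csv or parquet file into a DataFrame, chosen by file extension.

    Hypothetical helper for illustration only.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path, nrows=nrows)
    if suffix == ".parquet":
        # pd.read_parquet needs pyarrow or fastparquet to be installed.
        df = pd.read_parquet(path)
        return df if nrows is None else df.head(nrows)
    raise ValueError(f"unsupported file type: {suffix!r}")


# Write a small csv to a temporary folder and preview its first two rows.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "example.csv"
    pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 0.7, 0.9]}).to_csv(
        csv_path, index=False
    )
    preview = read_table(csv_path, nrows=2)

print(preview)
```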

Pip install:

pip install mb_pandas

Load all the other packages with mb_base

pip install mb_base
import mb.pandas as pd 
