mip_dmp.utils
The mip_dmp.utils package contains modules with utility functions for I/O, logging, and script argument parsing.
mip_dmp.utils.io
Module for input/output operations with files involved in the MIP Dataset Mapper.
- mip_dmp.utils.io.generate_output_path(input_cdes_file: str, output_dir: str, output_suffix: str)
Generate output path for CDEs file, but without any extension.
Parameters
- input_cdes_file : str
Path to input CDEs file in JSON or EXCEL format.
- output_dir : str
Path to directory where the output CDEs file will be written.
- output_suffix : str
Suffix to add to the input CDEs file name, to generate the output CDEs file name.
Returns
- out_cdes_fname : str
Generated absolute path for the output CDEs files where the updated CDEs are written, with extension automatically added (.json for JSON, .xlsx for EXCEL).
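The path construction described above can be illustrated with a minimal sketch. This is not the library's implementation: `sketch_output_path` is a hypothetical stand-in, and it assumes that the suffix is joined to the input file stem with an underscore and that the input extension is reused for the output.

```python
from pathlib import Path


def sketch_output_path(input_cdes_file: str, output_dir: str, output_suffix: str) -> Path:
    """Hypothetical stand-in for generate_output_path (not the library code)."""
    src = Path(input_cdes_file)
    # Assumption: the suffix is appended to the file stem with an underscore
    # and the input extension (.json or .xlsx) is carried over unchanged.
    out_name = f"{src.stem}_{output_suffix}{src.suffix}"
    # Resolve against the current working directory to get an absolute path.
    return (Path(output_dir) / out_name).absolute()


out = sketch_output_path("cdes/mip_cdes.json", "results", "updated")
print(out.name)  # mip_cdes_updated.json
```

The directory part of the result depends on the current working directory, so only the file name is shown here.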
- mip_dmp.utils.io.load_c2v_model(model_name='eng_50')
Load a chars2vec model from disk.
Parameters
- model_name : str, optional
Name of the chars2vec model to load, by default “eng_50”.
Returns
- dict
Dictionary containing the chars2vec model.
- mip_dmp.utils.io.load_csv(csv_file: str)
Load content of a CSV file.
Parameters
- csv_file : str
Path to CSV file.
Returns
- data : pd.DataFrame
Dataframe loaded from CSV file.
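As a rough illustration of what loading a CSV file into a dataframe looks like, the sketch below writes a small temporary file and reads it back. `sketch_load_csv` is a hypothetical stand-in that assumes a plain `pandas.read_csv` call with default options; the actual function may pass additional arguments. The file name and column names are made up for the example.

```python
import tempfile
from pathlib import Path

import pandas as pd


def sketch_load_csv(csv_file: str) -> pd.DataFrame:
    """Hypothetical stand-in for load_csv (assumes default pandas.read_csv)."""
    return pd.read_csv(csv_file)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Illustrative two-row dataset; the columns are invented for this sketch.
    csv_path = Path(tmp_dir) / "dataset.csv"
    csv_path.write_text("subject_id,age\ns01,34\ns02,58\n")
    data = sketch_load_csv(str(csv_path))

print(data.shape)  # (2, 2)
```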
- mip_dmp.utils.io.load_excel(excel_file: str)
Load content of an Excel file.
Parameters
- excel_file : str
Path to Excel file.
Returns
- data : pd.DataFrame
Dataframe loaded from Excel file.
- mip_dmp.utils.io.load_glove_model(model_name='glove-wiki-gigaword-50')
Load a GloVe model from disk.
Parameters
- model_name : str, optional
Name of the GloVe model to load, by default “glove-wiki-gigaword-50”.
Returns
- dict
Dictionary containing the GloVe model.
mip_dmp.utils.logger
Module to set up logging for the MIP Dataset Mapper.
mip_dmp.utils.parser
Module that creates the argument parser for the script, i.e. the command-line interface of the MIP Dataset Mapper.