mip_dmp.plot.embedding module
Module to plot the embeddings of the column names and CDE codes.
- mip_dmp.plot.embedding.scatterplot_embeddings(fig: Figure, embeddings: dict, matchedCdeCodes: dict, selectedColumnName: str)
Plot the embeddings of the selected column name and CDE codes in a 3D scatter plot.
- fig: matplotlib.figure.Figure
Figure to render the 3D scatter plot of the embeddings.
- embeddings: dict
Dictionary of embeddings in the form:
{ "x": [5, ..., 2], "y": [0.5, ..., 0.2], "z": [0.5, ..., 0.2], "label": ["word1", ..., "wordN"], "type": ["cde", ..., "column"] }
where x, y and z are the lists of the x, y and z coordinates of the embeddings, label is the list of the labels of the embeddings, and type is the list of the types of the embeddings (each entry is either "column" or "cde").
- matchedCdeCodes: dict
Dictionary of the matched CDE codes in the form:
{ "input_dataset_column1": { "words": ["cde_code1", "cde_code2", ...], "embeddings": [embedding_vector1, embedding_vector2, ...], "distances": [distance1, distance2, ...] }, "input_dataset_column2": { "words": ["cde_code1", "cde_code2", ...], "embeddings": [embedding_vector1, embedding_vector2, ...], "distances": [distance1, distance2, ...] }, ... }
- selectedColumnName: str
Name of the selected column.
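The two dictionary arguments described above can be illustrated with a minimal sketch. It builds an embeddings dictionary and a matchedCdeCodes dictionary in the documented shapes, checks that the embeddings lists are consistent, and picks out the points belonging to a selected column and its matched CDE codes. The sample values, the column and CDE names, and the helpers validate_embeddings and matched_points are hypothetical and not part of mip_dmp. Whether scatterplot_embeddings selects points this way internally is an assumption.

```python
# Hypothetical sample data in the documented shapes (values are made up).
embeddings = {
    "x": [0.1, 0.4, 0.9, 0.2],
    "y": [0.5, 0.3, 0.8, 0.6],
    "z": [0.5, 0.2, 0.1, 0.4],
    "label": ["age", "age_value", "gender", "subject_age"],
    "type": ["column", "cde", "cde", "cde"],
}

matched_cde_codes = {
    "age": {
        "words": ["age_value", "subject_age"],
        "embeddings": [[0.4, 0.3, 0.2], [0.2, 0.6, 0.4]],
        "distances": [0.12, 0.25],
    },
}


def validate_embeddings(embeddings: dict) -> int:
    """Check the documented shape and return the number of points."""
    keys = ("x", "y", "z", "label", "type")
    missing = [k for k in keys if k not in embeddings]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    lengths = {len(embeddings[k]) for k in keys}
    if len(lengths) != 1:
        raise ValueError("all lists must have the same length")
    unknown = set(embeddings["type"]) - {"column", "cde"}
    if unknown:
        raise ValueError(f"unknown types: {sorted(unknown)}")
    return lengths.pop()


def matched_points(embeddings: dict, matched: dict, selected: str) -> list:
    """Return (label, x, y, z) for the selected column and its matched CDE codes."""
    wanted = set(matched[selected]["words"])
    points = []
    for x, y, z, label, kind in zip(
        embeddings["x"], embeddings["y"], embeddings["z"],
        embeddings["label"], embeddings["type"],
    ):
        is_selected_column = kind == "column" and label == selected
        is_matched_cde = kind == "cde" and label in wanted
        if is_selected_column or is_matched_cde:
            points.append((label, x, y, z))
    return points


n_points = validate_embeddings(embeddings)
points = matched_points(embeddings, matched_cde_codes, "age")
```

Dictionaries shaped like these would then be passed, together with a matplotlib Figure and the selected column name, as the fig, embeddings, matchedCdeCodes and selectedColumnName arguments of scatterplot_embeddings.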