mip_dmp.qt5.components.embedding_visualization_widget module
Module that defines the widget class for visualizing the initial automated mapping matches via embeddings.
- class mip_dmp.qt5.components.embedding_visualization_widget.WordEmbeddingVisualizationWidget(parent=None)[source]
Bases: QWidget
Class for the widget that supports the visualization of the automated column / CDE code matches via embedding.
- adjustWindow()[source]
Adjust the window size, Qt Style Sheet, and title.
Parameters
- mainWindow: QMainWindow
The main window of the application.
- generate_embedding_figure()[source]
Generate a 3D scatter plot showing the dimensionality-reduced embedding vectors of the words.
- generate_embeddings(inputDatasetColumns: list, targetCDECodes: list, matchingMethod: str)[source]
Generate the embeddings of the columns and CDE codes.
Set the input dataset columns (self.inputDatasetColumns), the target CDE codes (self.targetCDECodes), the input dataset column embeddings (self.inputDatasetColumnEmbeddings) and the target CDE code embeddings (self.targetCDECodeEmbeddings).
The embeddings are generated using the specified matching method (matchingMethod). The matching method can be “glove” or “chars2vec”.
- inputDatasetColumns: list
List of the input dataset columns.
- targetCDECodes: list
List of the target CDE codes.
- matchingMethod: str
Matching method. Can be “glove” or “chars2vec”.
- set_embeddings(inputDatasetColumnEmbeddings: list, inputDatasetColumns: list, targetCDECodeEmbeddings: list, targetCDECodes: list, matchedCdeCodes: dict, matchingMethod: str)[source]
Set the input dataset column and target CDE code embeddings.
- inputDatasetColumnEmbeddings: list
List of the input dataset column embeddings.
- inputDatasetColumns: list
List of the input dataset columns.
- targetCDECodeEmbeddings: list
List of the target CDE code embeddings.
- targetCDECodes: list
List of the target CDE codes.
- matchedCdeCodes: dict
Dictionary of the matched CDE codes in the form:
    {
        "input_dataset_column1": {
            "words": ["cde_code1", "cde_code2", ...],
            "embeddings": [embedding_vector1, embedding_vector2, ...],
            "distances": [distance1, distance2, ...]
        },
        "input_dataset_column2": {
            "words": ["cde_code1", "cde_code2", ...],
            "embeddings": [embedding_vector1, embedding_vector2, ...],
            "distances": [distance1, distance2, ...]
        },
        ...
    }
- matchingMethod: str
Matching method. Can be “glove” or “chars2vec”.
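The shape of matchedCdeCodes can be illustrated with a minimal sketch. The helper name build_matched_cde_codes, the top_k parameter, the toy vectors, and the use of Euclidean distance are all assumptions made for illustration; the documentation above does not state which distance metric mip_dmp uses or how many matches are kept per column.

```python
import math


def build_matched_cde_codes(column_embeddings, cde_embeddings, top_k=2):
    """Return a dict shaped like the documented ``matchedCdeCodes`` argument.

    Hypothetical helper: ranks CDE codes by Euclidean distance, which is an
    assumption and not necessarily the metric that mip_dmp uses.
    """
    matched = {}
    for column, column_vector in column_embeddings.items():
        # Sort the CDE codes from the closest to the farthest embedding
        # and keep only the top_k closest ones.
        ranked = sorted(
            cde_embeddings.items(),
            key=lambda item: math.dist(column_vector, item[1]),
        )[:top_k]
        matched[column] = {
            "words": [code for code, _ in ranked],
            "embeddings": [vector for _, vector in ranked],
            "distances": [math.dist(column_vector, vector) for _, vector in ranked],
        }
    return matched


# Toy 3-dimensional vectors; real embeddings have many more dimensions.
column_embeddings = {"age_years": [1.0, 0.0, 0.0]}
cde_embeddings = {
    "age": [1.0, 0.0, 0.0],
    "gender": [0.0, 1.0, 0.0],
    "weight": [4.0, 4.0, 0.0],
}
matched_cde_codes = build_matched_cde_codes(column_embeddings, cde_embeddings)
```

Each input dataset column maps to three parallel lists, so that words[i], embeddings[i] and distances[i] all describe the same candidate CDE code.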
- set_matching_method(matchingMethod)[source]
Set the matching method.
- matchingMethod: str
Matching method. Can be “glove” or “chars2vec”.
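Since only two values are documented for matchingMethod, calling code may want to check the string before passing it to the widget. The sketch below is an assumption about how a caller could do this; check_matching_method and SUPPORTED_MATCHING_METHODS are hypothetical names and are not part of mip_dmp, and the widget itself may or may not perform a similar check.

```python
# The two values documented for ``matchingMethod``.
SUPPORTED_MATCHING_METHODS = ("glove", "chars2vec")


def check_matching_method(matching_method):
    """Return ``matching_method`` unchanged if it is a documented value.

    Hypothetical caller-side guard; raises ValueError otherwise.
    """
    if matching_method not in SUPPORTED_MATCHING_METHODS:
        raise ValueError(
            f"Unsupported matching method {matching_method!r}; "
            f"expected one of {SUPPORTED_MATCHING_METHODS}"
        )
    return matching_method


checked_method = check_matching_method("glove")
```

The returned string could then be passed on, for example as widget.set_matching_method(checked_method), where widget is an existing WordEmbeddingVisualizationWidget instance.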
- set_word_list(wordList)[source]
Set the list of words that can be visualized in the 3D scatter plot.
- wordList: list
List of words to visualize in the 3D scatter plot.
- set_wordcombobox_items(wordList)[source]
Set the items of the word combo box.
- wordList: list
List of words to show in the widget's combo box, which selects the word to visualize in the 3D scatter plot.
- staticMetaObject = <PySide2.QtCore.QMetaObject object>