DB package#

Submodules#

DB.DatabaseSetup module#

academic_metrics.DB.DatabaseSetup.CollectionData#

Type alias representing a collection of documents from MongoDB.

Each document is represented as a dictionary with string keys and arbitrary values.

Type:

List[Dict[str, Any]]: A list of dictionaries where each dictionary represents a MongoDB document.

alias of List[Dict[str, Any]]

academic_metrics.DB.DatabaseSetup.DatabaseSnapshot#

Type alias representing a snapshot of all collections in the database.

Contains data from articles, categories, and faculty collections in that order.

Type:
Tuple[CollectionData, CollectionData, CollectionData]: A tuple containing:
  • article_data (CollectionData): Documents from the articles collection

  • category_data (CollectionData): Documents from the categories collection

  • faculty_data (CollectionData): Documents from the faculty collection

alias of Tuple[List[Dict[str, Any]], List[Dict[str, Any]], List[Dict[str, Any]]]
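As an illustration of how the two aliases fit together, here is a minimal sketch that redefines them locally and unpacks a snapshot in the documented order (articles, categories, faculty). The helper `count_documents` and the sample documents are hypothetical and not part of the package.

```python
from typing import Any, Dict, List, Tuple

# Local aliases mirroring the documented definitions.
CollectionData = List[Dict[str, Any]]
DatabaseSnapshot = Tuple[CollectionData, CollectionData, CollectionData]


def count_documents(snapshot: DatabaseSnapshot) -> Dict[str, int]:
    """Return the number of documents in each part of a snapshot."""
    # Order matters: articles first, then categories, then faculty.
    article_data, category_data, faculty_data = snapshot
    return {
        "articles": len(article_data),
        "categories": len(category_data),
        "faculty": len(faculty_data),
    }


# Hypothetical sample documents; the field names are illustrative only.
sample: DatabaseSnapshot = (
    [{"_id": "10.1000/a"}, {"_id": "10.1000/b"}],
    [{"_id": "Physics"}],
    [],
)
counts = count_documents(sample)
```

Because the snapshot is a plain tuple, the position of each element is the only thing that identifies which collection it came from.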

class academic_metrics.DB.DatabaseSetup.DatabaseWrapper(*, db_name, mongo_uri)[source]#

Bases: object

A wrapper class for MongoDB operations.

logger#

Logger for logging messages.

Type:

logging.Logger

client#

MongoDB client.

Type:

MongoClient

db#

MongoDB database.

Type:

Database

article_collection#

MongoDB collection for article data.

Type:

Collection

category_collection#

MongoDB collection for category data.

Type:

Collection

faculty_collection#

MongoDB collection for faculty data.

Type:

Collection

_test_connection()[source]#

Test the connection to the MongoDB server.

get_dois()[source]#

Get all DOIs from the article collection.

get_all_data()[source]#

Get all data from the article, category, and faculty collections.

insert_categories()[source]#

Insert multiple categories into the collection.

update_category()[source]#

Update an existing category.

insert_articles()[source]#

Insert multiple articles into the collection.

insert_faculty()[source]#

Insert multiple faculty entries into the collection.

update_faculty()[source]#

Update an existing faculty member.

process()[source]#

Process data and insert it into the appropriate collection.

run_all_process()[source]#

Run the process method for all collections.

clear_collection()[source]#

Clear the entire collection.

close_connection()[source]#

Close the connection to the MongoDB server.

__init__(*, db_name, mongo_uri)[source]#

Initialize the DatabaseWrapper with a database name and a MongoDB URI.

Parameters:
  • db_name (str) – Name of the database.

  • mongo_uri (str) – MongoDB URI.
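The bare `*` in the signature means both parameters must be passed by keyword. The sketch below uses a stand-in class, `WrapperSketch`, with the same signature to show the effect. It does not connect to MongoDB, and the database name and URI are placeholder values.

```python
class WrapperSketch:
    """Stand-in with the same keyword-only signature as the documented constructor."""

    def __init__(self, *, db_name: str, mongo_uri: str) -> None:
        self.db_name = db_name
        self.mongo_uri = mongo_uri


# Keyword arguments are accepted.
wrapper = WrapperSketch(db_name="metrics", mongo_uri="mongodb://localhost:27017")

# Positional arguments raise TypeError because of the bare `*`.
try:
    WrapperSketch("metrics", "mongodb://localhost:27017")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```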

_test_connection()[source]#

Test the connection to the MongoDB server.

get_dois()[source]#

Get all DOIs from the article collection.

Returns:

List of DOIs.

Return type:

doi_list (List[str])

get_all_data()[source]#

Get all data from the article, category, and faculty collections.

Returns:

A tuple containing:

  • articles (CollectionData): Documents from the articles collection

  • categories (CollectionData): Documents from the categories collection

  • faculty (CollectionData): Documents from the faculty collection

Return type:

Tuple[CollectionData, CollectionData, CollectionData]

insert_categories(category_data)[source]#

Insert multiple categories into the collection.

If a category already exists, its numeric fields are summed with the new values and its list fields are extended.

Parameters:

category_data (List[Dict[str, Any]]) – List of category data.

update_category(existing_data, new_data)[source]#

Update existing category data with new data, handling None values and logging state.

Parameters:
  • existing_data (Dict[str, Any]) – Existing category data.

  • new_data (Dict[str, Any]) – New category data.

Returns:

Updated category data.

Return type:

existing_data (Dict[str, Any])
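One way to picture the update is as a field-by-field merge: numbers are added, lists are concatenated, and None on either side is treated as "no value". The following is only a sketch under those assumptions, not the package's implementation. The function name and field names are hypothetical, and unlike the documented method it returns a new dictionary rather than the updated `existing_data`.

```python
from typing import Any, Dict


def merge_counts_and_lists(existing: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """Merge `new` into a copy of `existing`, summing numbers and joining lists."""
    merged = dict(existing)
    for key, new_value in new.items():
        old_value = merged.get(key)
        if new_value is None:
            # Nothing to merge in; keep whatever is already stored.
            continue
        if old_value is None:
            # No usable existing value; take the new one as-is.
            merged[key] = new_value
        elif isinstance(old_value, (int, float)) and isinstance(new_value, (int, float)):
            merged[key] = old_value + new_value
        elif isinstance(old_value, list) and isinstance(new_value, list):
            merged[key] = old_value + new_value
        else:
            # Assumption: any other type is simply overwritten.
            merged[key] = new_value
    return merged


# Hypothetical category documents.
existing = {"name": "Physics", "article_count": 3, "citations": None, "dois": ["10.1000/a"]}
new = {"article_count": 2, "citations": 7, "dois": ["10.1000/b"], "url": None}
merged = merge_counts_and_lists(existing, new)
```

In this sketch a None in the new data never erases a stored value, and a None in the stored data is replaced by the first real value that arrives.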

insert_articles(article_data)[source]#

Insert multiple articles into the collection.

If an article already exists, merge the new data with the existing data.

Parameters:

article_data (List[Dict[str, Any]]) – List of article data.
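The insert-or-merge behaviour described above can be sketched with an in-memory dictionary standing in for the collection. This is a sketch under stated assumptions: the `doi` key used to identify an article, the rule that new non-None values overwrite old ones, and the function name `upsert_articles` are all hypothetical, since the page does not specify how the merge is performed.

```python
from typing import Any, Dict, List


def upsert_articles(store: Dict[str, Dict[str, Any]], article_data: List[Dict[str, Any]]) -> None:
    """Insert each article into `store`, merging into any entry with the same DOI."""
    for article in article_data:
        key = article["doi"]  # hypothetical identifying field
        if key in store:
            # Existing article: new non-None values overwrite old ones.
            store[key].update({k: v for k, v in article.items() if v is not None})
        else:
            # New article: store a copy so later merges do not alter the input.
            store[key] = dict(article)


store: Dict[str, Dict[str, Any]] = {}
upsert_articles(store, [{"doi": "10.1000/a", "title": "Draft", "year": 2023}])
upsert_articles(
    store,
    [
        {"doi": "10.1000/a", "title": "Final", "year": None},
        {"doi": "10.1000/b", "title": "Other"},
    ],
)
```

After the second call the store holds two articles, and the first one keeps its year while taking the newer title.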

insert_faculty(faculty_data)[source]#

Insert multiple faculty entries into the collection.

If a faculty member already exists, update the data accordingly.

Parameters:

faculty_data (List[Dict[str, Any]]) – List of faculty data.

update_faculty(existing_data, new_data)[source]#

Update existing faculty data with new data, handling None values and logging state.

Parameters:
  • existing_data (Dict[str, Any]) – Existing faculty data.

  • new_data (Dict[str, Any]) – New faculty data.

Returns:

Updated faculty data.

Return type:

existing_data (Dict[str, Any])

process(data, collection)[source]#

Process data and insert it into the appropriate collection.

Parameters:
  • data (List[Dict[str, Any]]) – Data to be inserted.

  • collection (str) – Name of the collection to insert the data into.
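Routing by collection name is commonly done with a lookup table from name to handler. The sketch below shows that pattern with a stand-in class. The accepted names ("articles", "categories", "faculty"), the ValueError for unknown names, and the `calls` log are assumptions made for illustration, and the real method may behave differently.

```python
from typing import Any, Callable, Dict, List

CollectionData = List[Dict[str, Any]]


class DispatchSketch:
    """Stand-in that routes data to an insert method based on a collection name."""

    def __init__(self) -> None:
        self.calls: List[str] = []  # records which handler ran, for demonstration
        self._handlers: Dict[str, Callable[[CollectionData], None]] = {
            "articles": self.insert_articles,
            "categories": self.insert_categories,
            "faculty": self.insert_faculty,
        }

    def insert_articles(self, data: CollectionData) -> None:
        self.calls.append(f"articles:{len(data)}")

    def insert_categories(self, data: CollectionData) -> None:
        self.calls.append(f"categories:{len(data)}")

    def insert_faculty(self, data: CollectionData) -> None:
        self.calls.append(f"faculty:{len(data)}")

    def process(self, data: CollectionData, collection: str) -> None:
        try:
            handler = self._handlers[collection]
        except KeyError:
            raise ValueError(f"Unknown collection: {collection!r}") from None
        handler(data)


sketch = DispatchSketch()
sketch.process([{"doi": "10.1000/a"}], "articles")
sketch.process([], "faculty")
```

A table like this keeps the set of valid names in one place, so adding a collection means adding one entry rather than another branch.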

run_all_process(category_data, article_data, faculty_data)[source]#

Process all data and insert it into the appropriate collections.

Parameters:
  • category_data (List[Dict[str, Any]]) – Category data.

  • article_data (List[Dict[str, Any]]) – Article data.

  • faculty_data (List[Dict[str, Any]]) – Faculty data.

fix_counts()[source]#

clear_collection()[source]#

Clear the entire collection.

close_connection()[source]#

Close the connection to the MongoDB server.

Module contents#