
Bidirectional Encoder Representation Model with Centrality-Weighting for Enhanced Sequence Labeling in Key Phrase Extraction from Scientific Texts

Author(s):

Kwame T. Mensah¹, Aisha L. Abubakar², and Ndidi O. Okafor³

Affiliation: ¹,²,³Department of Computer Science; ¹University of Ghana, Accra, Ghana; ²Ahmadu Bello University, Zaria, Nigeria; ³University of Lagos, Lagos, Nigeria

Page No: 1-10

Volume, Issue & Publishing Year: Volume 1, Issue 7, Dec-2024

Journal: International Journal of Modern Engineering and Management | IJMEM

ISSN NO: 3048-8230

DOI:

Abstract:

Deep learning methods, particularly those leveraging Bidirectional Encoder Representations from Transformers (BERT) with advanced fine-tuning techniques, have demonstrated state-of-the-art performance in term extraction from text. However, BERT's focus on semantic context within localized text limits its ability to assess the overall relevance of tokens to the entire document. Existing approaches that apply sequence labeling on contextualized embeddings also rely predominantly on local context, often neglecting document-level significance. To address these challenges, this study introduces CenBERT-SEQ, a centrality-weighted BERT-based model for keyphrase extraction using sequence labeling. CenBERT-SEQ combines BERT's contextual embedding capabilities with a novel centrality-weighting layer that integrates document-level embeddings to emphasize the relevance of terms within the document context. A final linear classifier layer captures dependencies between outputs, enhancing overall model accuracy. The proposed model was evaluated against the BERT-base-uncased model on three benchmark datasets of Computer Science articles: SemEval-2010, WWW, and KDD. Results indicate that CenBERT-SEQ surpasses the BERT-base model in precision, recall, and F1-score. Compared to related studies, CenBERT-SEQ demonstrated superior performance, achieving an accuracy of 95%, precision of 97%, recall of 91%, and an F1-score of 94%. These findings underscore CenBERT-SEQ's effectiveness in extracting keyphrases from scientific documents and its advance over existing methodologies.
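
As an illustration of the architecture described in the abstract, the sketch below shows one way a centrality-weighted BERT sequence labeler could be assembled. It is a minimal sketch, not the authors' implementation: the class name, the use of BERT's pooled output as the document-level embedding, the cosine-similarity form of the centrality weight, and the B/I/O keyphrase tag set are all assumptions made for illustration.

```python
# Illustrative sketch only; layer names, the cosine-similarity centrality
# weighting, and the pooled-output document embedding are assumptions,
# not the published CenBERT-SEQ implementation.
import torch
import torch.nn as nn
from transformers import BertModel

class CenBertSeqSketch(nn.Module):
    """BERT token encoder + centrality-weighting layer + linear tagger."""

    def __init__(self, num_labels: int = 3, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        self.classifier = nn.Linear(hidden, num_labels)  # e.g. B/I/O keyphrase tags

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        token_emb = out.last_hidden_state          # (batch, seq_len, hidden) local context
        doc_emb = out.pooler_output.unsqueeze(1)   # (batch, 1, hidden) document-level embedding

        # Centrality weight: similarity of each token to the whole document,
        # used to up-weight tokens that matter at document level (assumed form).
        centrality = torch.cosine_similarity(token_emb, doc_emb, dim=-1)  # (batch, seq_len)
        weighted = token_emb * centrality.unsqueeze(-1)

        return self.classifier(weighted)           # per-token tag logits
```

In use, an article would be tokenized with the matching BertTokenizer and the arg-max of the per-token logits read off as keyphrase tags; the centrality term is what distinguishes this sketch from a plain BERT token classifier.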

Keywords:

Term Extraction, BERT, Sequence Labeling, Centrality Weighting, Scientific Articles

