Dr Nicola Stokes

School of Computer Science and Informatics,

College of Engineering, Mathematical & Physical Sciences,

University College Dublin,

Belfield, Dublin 4

Ireland.

[nicola.stokes at ucd.ie]


Research Interests

I am interested in developing robust statistical and linguistic Text Analytics techniques (e.g. Lexical Cohesion Analysis, Textual Entailment and Paraphrase Identification, Temporal Expression Analysis) for use in Natural Language Processing and Information Retrieval applications such as Text Summarisation, Question Answering and Text Classification. I apply these techniques across application domains including Newswire and Biomedical and Clinical datasets (e.g. scientific publications and patient records).

 

Current Post and Academic Background

In recent years, I have been working as a Strategic Research Manager in the School of Computer Science and Informatics, UCD. [LinkedIn] [Google Scholar]

From 2005 to mid-2008, I was a postdoctoral research fellow in the National ICT Australia (NICTA) Victoria Lab at the University of Melbourne. During this time I worked with Prof. Steven Bird, Prof. Tim Baldwin, Dr Lawrence Cavedon, Prof. James Bailey, Prof. Alistair Moffat, Prof. Justin Zobel, and other members of the Interactive Information Discovery and Delivery project (I2D2), which was part of the Network Information Processing Program at the NICTA Victoria Lab. 

I completed my PhD in 2004 under the supervision of Prof. Joe Carthy at the Intelligent Information Retrieval group in the Department of Computer Science, University College Dublin. My thesis investigates the appropriateness of using lexical cohesion analysis (provided by lexical chains) to improve IR and NLP tasks in the Topic Detection and Tracking domain. During the course of this work I focussed on three separate tasks: New Event Detection (i.e. the detection of breaking news stories as they arrive on a news stream), News Story Segmentation (i.e. the identification of boundaries between adjacent news stories in a broadcast news programme transcript) and News Story Gisting (i.e. the generation of single-sentence news story summaries) for broadcast news and newswire data. After completing my PhD, and before moving to Melbourne, I held a one-year postdoc position in UCD where I worked with the UCD Summarisation group.

In 2001 I spent a semester at the Center for Intelligent Information Retrieval (CIIR) at UMass working with Prof. James Allan and Prof. Victor Lavrenko on the New Event Detection task. From 2002 to 2003 my work on News Story Segmentation and Gisting was motivated by collaborative work with Prof. Alan Smeaton and his group on the Fischlar News Stories system at the Centre for Digital Video Processing, Dublin City University.

 

Publications

Publication Statistics

I have an h-index of 15 and over 650 citations. I have published close to 50 papers in peer-reviewed journals, conferences and workshops. I am also the recipient of 3 best paper awards for my work. For an up-to-date list of publications and metrics, you can visit my Google Scholar profile.

PhD Thesis 

Nicola Stokes. Applications of Lexical Cohesion Analysis in the Topic Detection and Tracking Domain. Department of Computer Science, University College Dublin, April 2004. [pdf]  [zip] 

Journal Papers

Martina Naughton, Nicola Stokes, Joe Carthy. Sentence-level Event Coreference Resolution. Under Review.

Bader Aljaber, David Martinez, Nicola Stokes, James Bailey. Improving MeSH Classification of Biomedical Articles using Citation Contexts. Journal of Biomedical Informatics, 44(5):881-896, 2011. [pdf]

Bader Aljaber, Nicola Stokes, James Bailey, Jian Pei. Document Clustering of Scientific Texts using Citation Contexts. Information Retrieval, 13(2):101-131, 2010. [pdf]

Martina Naughton, Nicola Stokes, Joe Carthy. Sentence-level Event Classification in Unstructured Texts. Information Retrieval, 13(2):132-156, 2010. [pdf]

Robin Boutros, Nicola Stokes, Michaël Bekaert, Emma C. Teeling. UniPrime2: a web service providing easier Universal Primer Design. Nucleic Acids Research, 2009. [full-paper] [UniPrime2]

Nicola Stokes, Yi Li, Lawrence Cavedon, Justin Zobel. Exploring criteria for successful query expansion in the Genomic domain. Information Retrieval, 12:17-50, 2009. [pdf]

Nicola Stokes, Yi Li, Alistair Moffat, Jiawen Rong. An empirical study of the effects of NLP components on Geographic IR performance. In the special issue on Geographic Information Retrieval, International Journal of Geographical Information Science, 22(3):247-264, 2008. [pdf]

Eamonn Newman, Joe Carthy, John Dunnion, Nicola Stokes. Identifying Semantic Equivalence for Multi-document Summarisation. Artificial Intelligence Review, 25(1-2):55-65, 2006. [pdf]

Nicola Stokes, Joe Carthy, Alan F. Smeaton. SeLeCT: A Lexical Cohesion based News Story Segmentation System. In the Journal of AI Communications, 17(1):3-12, 2004. [pdf]

Book Reviews

Nicola Stokes. William Hersh: Information retrieval: a health and biomedical perspective, 3rd ed - Book Review. Information Retrieval, published online June 2009.

Nicola Stokes. TREC: Experiment and Evaluation in Information Retrieval - Book Review. Computational Linguistics, Vol. 32, No. 4, pp. 563-567, 2006.

Conference and Workshop Papers

2013

Jesus O Iglesias, Philip Perry, Nicola Stokes, James Thorburn, Liam Murphy. A cost-capacity analysis for assessing the efficiency of heterogeneous computing assets in an enterprise cloud. In the Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing, 2013.

James Cogley, Nicola Stokes, Joe Carthy. Exploring the effectiveness of medical entity recognition for clinical information retrieval. In the Proceedings of the 7th ACL International Workshop on Data and Text Mining in Biomedical Informatics, 2013.

Jesus O. Iglesias, Nicola Stokes, Anthony Ventresque, Liam Murphy, James Thorburn. Towards the Automatic Detection of Efficient Computing Assets in a Heterogeneous Cloud Environment. In the Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing, 2013.

Xi Li, Anthony Ventresque, Nicola Stokes, James Thorburn, John Murphy. iVMp: An Interactive VM Placement Algorithm for Agile Capital Allocation. In the Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing, 2013.

2012

James Cogley, Nicola Stokes, John Dunnion, Joe Carthy. UCD IIRG at TREC 2012 Medical Track. In the proceedings of TREC 2012.

James Cogley, Nicola Stokes, John Dunnion, Joe Carthy. Analyzing patient records to establish if and when a patient suffered from a medical condition. In the Proceedings of the ACL 2012 Workshop on Biomedical Natural Language Processing, 2012.

2008

Bader Aljaber, Nicola Stokes, James Bailey, Yi Li. Exploring the benefit of contextual information for boosting TREC Genomic IR performance. In the Proceedings of the Australasian Document Computing Symposium (ADCS), 2008. [pdf]

Martina Naughton, Nicola Stokes, Joe Carthy. Investigating Statistical Techniques for Sentence-level Event Classification. In the Proceedings of Coling 2008. [pdf]

2007

Nicola Stokes, Yi Li, Lawrence Cavedon, Justin Zobel. Exploring abbreviation expansion for Genomic Information Retrieval. In the proceedings of the Australasian Language Technology Workshop, 2007. [Best Paper Award] [pdf]

Nicola Stokes, Yi Li, Lawrence Cavedon, Eric Huang, Jiawen Rong, Justin Zobel. Entity-based relevance feedback for genomic list answer retrieval. In the proceedings of the TREC Genomics Track, 2007. [pdf]

Benjamin Goudey, Nicola Stokes, David Martinez. Exploring extensions to machine learning-based Gene Normalisation. In the Proceedings of the Australasian Language Technology Workshop, 2007. [pdf]

Yi Li, Nicola Stokes, Lawrence Cavedon, Alistair Moffat. NICTA I2D2 Group at GeoCLEF 2006. In Evaluation of Multilingual and Multi-modal Information Retrieval, LNCS, Springer, Vol. 4730/2007, pp. 938-945, 2007. [pdf]

Nicola Stokes, Jiawen Rong, Lawrence Cavedon. NICTA's Update and Question-based Summarisation Systems at DUC 2007. In the Proceedings of the Document Understanding Conference Workshop, 2007. [pdf]

2006

Yi Li, Nicola Stokes, Lawrence Cavedon, Alistair Moffat. NICTA I2D2 Group at GeoCLEF 2006. In the Proceedings of the GeoCLEF Workshop on Geo-Spatial IR, Alicante, Spain, 2006. [pdf]

Yi Li, Alistair Moffat, Nicola Stokes, Lawrence Cavedon. Exploring probabilistic toponym resolution for geographical information retrieval. In the Proceedings of SIGIR Workshop on Geographical Information Retrieval, pp. 17-22, 2006. [pdf]

Jeremy Nicholson, Nicola Stokes, Tim Baldwin. Detecting entailment using an extended implementation of the basic elements overlap metric. In the Proceedings of the Second Pascal Recognising Textual Entailment Challenge (RTE2), Venice, pp. 122-7, 2006. [pdf]

Eamonn Newman, Nicola Stokes, Joe Carthy, John Dunnion. Textual Entailment Recognition Using a Linguistically-Motivated Decision Tree Classifier. Machine Learning Challenges (First PASCAL Machine Learning Challenges Workshop, MLCW 2005, Revised Selected Papers), Lecture Notes in Computer Science, Springer, pp. 372-384, 2006. [pdf]

2005

Nicola Stokes, Eamonn Newman. Multi-document Summarisation and the PASCAL Textual Entailment Challenge. In the Proceedings of the Australasian Language Technology Workshop 2005 (ALTW 2005), Australasian Language Technology Association, pp. 215-223, 2005. [pdf]

Eamonn Newman, Nicola Stokes, Joe Carthy, John Dunnion. UCD IIRG Approach to the Textual Entailment Challenge. In the Proceedings of the PASCAL Recognising Textual Entailment Challenge, April 2005. [pdf]

Ruichao Wang, Nicola Stokes, William Doran, Eamonn Newman, Joe Carthy, John Dunnion. Comparing Topiary-style approaches to Headline Generation. In the Proceedings of the 27th European Conference on Information Retrieval (ECIR-05), Santiago de Compostela, Spain, March 2005. [pdf]

Ruichao Wang, Nicola Stokes, William Doran, Eamonn Newman, Joe Carthy, John Dunnion. News Headline Generation Based on Linguistic Methods. In the Proceedings of the IASTED International Conference on Artificial Intelligence and Applications (AIA 2005), Innsbruck, Austria, January 2005.

Ruichao Wang, Nicola Stokes, William Doran, Eamonn Newman, John Dunnion, Joe Carthy. LexTrim: A Lexical Cohesion based Approach to Parse-and-Trim Style Headline Generation. In the Proceedings of the 6th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2005), Mexico City, January 2005. [Best Poster Award] [pdf]

2004

Eamonn Newman, William Doran, Nicola Stokes, Joe Carthy, John Dunnion. Examination of Similarity Metrics for Redundancy Removal in Multi-Document Summarisation. In the Proceedings of the 15th AICS Conference, pp. 292-301, Castlebar, Co. Mayo, September 2004.

Eamonn Newman, William Doran, Nicola Stokes, Joe Carthy, John Dunnion. Comparing Redundancy Removal Techniques for Multi-document Summarisation. In the Proceedings of STAIRS, pp. 223-228, August 2004. [pdf]

William P. Doran, Nicola Stokes, Eamonn Newman, John Dunnion, Joe Carthy. A Hybrid Statistical/Linguistic approach to News Story Gisting. In the Proceedings of the 27th ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 464-465, July 2004. [pdf]

William P. Doran, Nicola Stokes, Eamonn Newman, John Dunnion, Joe Carthy, Fergus Toolan. News Story Gisting at University College Dublin. In the Proceedings of the Document Understanding Conference (DUC), 2004.  [pdf]

Nicola Stokes, Eamonn Newman, Joe Carthy, Alan F. Smeaton. Broadcast News Gisting using Lexical Cohesion Analysis. In the Proceedings of the 26th European Conference on Information Retrieval (ECIR-04), pp. 209-222, Sunderland, U.K., 2004. [pdf]

William P. Doran, Nicola Stokes, John Dunnion, Joe Carthy. Assessing the Impact of Lexical Chain Scoring Methods and Sentence Extraction Schemes on Summarization. In the Proceedings of the 5th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2004), 2004. [pdf]

William P. Doran, Nicola Stokes, John Dunnion, Joe Carthy. Comparing Lexical Chain-based Summarisation Approaches using an Extrinsic Evaluation. In the Proceedings of the Global WordNet Conference (GWC 2004), 2004. [pdf]

2003

Nicola Stokes. Spoken and Written News Story Segmentation using Lexical Chaining. In the Proceedings of the Student Workshop at HLT-NAACL 2003, Companion Volume, pp. 49-54, Edmonton, Canada, 2003. [pdf]

2002

Nicola Stokes, Joe Carthy, Alan F. Smeaton. Segmenting Broadcast News Streams using Lexical Chaining. In the Proceedings of STAIRS 2002, Vol. 1, IOS Press, Ed. T. Vidal and P. Liberatore, pp. 145-154, Lyon, France, 2002. [Best Paper Award] See journal paper (Stokes et al., 2004).

2001

Nicola Stokes, Joe Carthy. Combining Semantic and Syntactic Document Classifiers to Improve First Story Detection. In the Proceedings of the 24th Annual ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 424-425, 2001. [pdf]

Nicola Stokes, Joe Carthy. Using Data Fusion to Improve First Story Detection. In the Proceedings of the 23rd BCS-IRSG European Colloquium IR Research, pp. 78-90, 2001.

Nicola Stokes, Joe Carthy. First Story Detection using a Composite Document Representation. In the Proceedings of HLT 2001, Human Language Technology Conference, pp. 134-141, 2001. [pdf]

2000

Nicola Stokes, Paula Hatch, Joe Carthy. Lexical Chaining for Web-Based Retrieval of Breaking News. In the Proceedings of the International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems AH2000, pp. 327-330, 2000. 

Nicola Stokes, Paula Hatch, Joe Carthy. Lexical Semantic Relatedness and Online News Event Detection. In the Proceedings of the 23rd Annual ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 324-325, 2000.  [pdf]

Nicola Stokes, Paula Hatch, Joe Carthy. Topic Detection, a new application for lexical chaining?  In the Proceedings of the 22nd BCS IRSG Colloquium, pp. 94-103, 2000. [pdf]

 

Awards

  • Best Paper Award at the Australasian Language Technology Workshop, 2007.

  • Best Poster Award at the International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), Mexico City, 2005.

  • Presenter at Science Uncovered 2005 at UCD 150 celebrations. [Press Coverage]

  • Best Paper Award at the Starting AI Researchers Symposium (STAIRS 2002), Lyon, France.

Activities

Organising Committee Member

Programme Committee Member

Journal Reviewer
  • Journal of Information Retrieval

  • Transactions on Information Systems

  • ACM Transactions on Speech and Language Processing (TSLP)

  • Journal of Language Resources and Evaluation

  

Last updated 01/06/14