A data-driven text mining and semantic network analysis for design information retrieval

File: Shi-F-2018-PhD-Thesis.pdf
Description: Thesis
Size: 5.71 MB
Format: Adobe PDF
Title: A data-driven text mining and semantic network analysis for design information retrieval
Authors: Shi, Feng
Item Type: Thesis or dissertation
Abstract: Data-driven design is an emerging area enabled by the advent of big-data tools. The massive amount of information stored in electronic and digital form on the internet provides opportunities for knowledge discovery in the fields of design and engineering. The aim of the research reported in this thesis is to facilitate the design information retrieval process based on large-scale electronic data through the use of text mining and semantic network techniques. We propose a data-driven pipeline for design information retrieval comprising four elements: data acquisition, text mining, semantic network analysis, and data visualisation and user interaction. Web crawling techniques are applied to fetch massive online textual data in the data acquisition process. Text mining then transforms the data from unstructured raw text into a structured semantic network. A retrieval analysis framework is proposed, based on the constructed semantic network, to retrieve relevant design information and provoke design innovation. Finally, a web-based platform, B-Link, has been developed to enable users to visualise the semantic network and interact with it through the proposed retrieval analysis framework. Seven case studies were conducted throughout the thesis to investigate the effectiveness of, and gain insight into, each element of the pipeline. Thousands of design news posts and millions of peer-reviewed engineering and design papers can be efficiently captured by web crawling techniques. Through the use of itemset mining and noun phrase chunking, a semantic network constructed from these textual data is shown to capture more inherent design- and engineering-oriented concepts and relations than the benchmark approaches: WordNet, ConceptNet, NELL and Wikipedia. A retrieval analysis framework has been developed with different retrieval behaviours to retrieve either general or domain-specific concepts, and either explicit or implicit knowledge relations; these behaviours are found to satisfy various knowledge demands in our real design projects at the conceptual stage. Finally, the result of a user test is shown to be consistent with these findings.
Content Version: Open Access
Issue Date: Aug-2018
Date Awarded: Dec-2018
URI: http://hdl.handle.net/10044/1/66182
Copyright Statement: Creative Commons Attribution NonCommercial NoDerivatives Licence
Supervisor: Childs, Peter; Aurisicchio, Marco
Sponsor/Funder: China Scholarship Council
Department: Dyson School of Design Engineering
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Design Engineering PhD theses



