Abstract
Building a recipe knowledge graph from the perspective of user knowledge demands can provide users with accurate and efficient retrieval results and recipes. Relatively few ontology studies have been conducted on Chinese recipes. In addition, some existing recipe ontology models are relatively rough, with few dimensions, whereas others have too many dimensions; that is, they lack versatility and cannot achieve efficient recipe recommendation and query functions. Therefore, this article proposes a general recipe ontology based on user knowledge demands analysed from multi-source recipes. Then, this article selects recipes for common edible flowers with health benefits, builds an ontology model and constructs a knowledge graph of flower recipes via knowledge extraction, knowledge fusion and other techniques. Finally, the application of the constructed flower recipe knowledge graph is discussed, and the results indicate that it can effectively return query results and satisfy the knowledge demands of users.
1. Introduction
In the context of Big Data, quick and efficient access to knowledge resources has become a focus of attention. An increasing number of indispensable intelligent products and services are flooding people’s daily lives, helping them gain useful knowledge. However, the exponential growth of data inevitably causes information overload and asymmetry problems [1]. Research on transforming information resources into knowledge based on computer technology and transferring it to users or forming a knowledge base has become increasingly important to help users use knowledge reasonably and effectively [2–4].
Rapid and continuous advancements in technology have changed our lives in many ways, and people are paying more attention to their health [5]. The nutritional value of food, the combination of ingredients, and the effects of eating play vital roles in people’s health. However, the amount of recipe information required is large, and the information is complex. Recipe knowledge involves nutrition, health, gardening, traditional Chinese medicine and other fields, and is distributed across multiple sources, making it difficult for users to find the required information. Inefficiency increases the difficulty in user selection. Users require extensive search and integration to obtain the required recipe knowledge. Only effective knowledge organisation and integration can help users find information quickly and accurately.
Google coined the term ‘knowledge graph’ in 2012. A knowledge graph is a knowledge organisation system. These graphs have been used in many knowledge-based applications, such as medicine [6], education [7], agriculture [8] and entertainment [9]. The construction of a knowledge graph is conducive to the organisation, querying and dissemination of knowledge. As a large-scale semantic network, it plays an important role in automatic question answering, knowledge recommendation and other services. Researchers have built recipe knowledge graphs, recipe recommendations and recipe innovation systems. However, most current knowledge graphs and systems are not driven by user demands [10].
An adequate system needs a good knowledge base that satisfies the demands of users. In this study, we built a general recipe ontology based on user demands to achieve efficient and accurate retrieval of recipes. We selected recipes involving common edible flowers with health benefits, analysed the characteristics of the flower recipes, constructed a flower recipe knowledge graph, and verified its effectiveness. The main contributions of this study are as follows:
Drawing on Chinese recipe websites and books, we analysed user knowledge demands for recipes, which comprise basic, functional and cultural demands. According to these knowledge demands, we built a general recipe ontology.
By collecting recipes for common edible flowers with health benefits from multiple sources and considering the characteristics of flower recipes, an ontology of flower recipes was constructed. Based on the natural language processing and machine learning methods, a recipe knowledge graph was constructed, and its effectiveness was examined.
The remainder of this article is organised as follows. Section 2 discusses related work on recipe ontologies and strategies for constructing a knowledge graph. Section 3 presents the methodology and overall framework of the study. Section 4 describes the process of analysing user knowledge demands and building a knowledge graph of flower recipes (including the construction of schema and data layers). Section 5 discusses the effectiveness of the knowledge graph of flower recipes. Section 6 summarises the conclusions of the study, discusses the limitations and presents suggestions for future work.
2. Related work
2.1. Research on recipe ontology
As users pay more attention to health, recipes have attracted increasing research attention. Research on recipe ontologies has mainly focused on three aspects: recipe recommendation, recipe innovation and recipe retrieval.
With regard to recipe recommendations, researchers have mainly discussed ways of recommending suitable recipes for specific groups of users. Faiz et al. constructed an ontology model based on four aspects: personal health information, disease, diet and fitness. They proposed a recommendation system for fitness methods and suitable recipes for patients [11]. Yusof and Noah [12] combined case-reasoning ideas and a food ontology model to design a recipe recommendation scheme for diabetic patients. In the study of Bianchini et al. [13], recipes were classified through ‘CourseType’, ‘CookingStyle’, ‘FoodCategory’ and their sub-concepts. These ontology models mainly aim to recommend suitable food recipes for patients. However, Sihwi et al. designed a recipe recommendation system for children aged 6–24 months. The model includes nine categories such as infant age, ingredients, seasonings, cooking methods, nutrition and so on [14]. Lertkrai et al. [15] helped Thai kindergarten children find recipes and achieve a balanced diet by building an ontology model involving health indicators, diet, cooking methods and calories. Although multi-faceted recipe ontology models have been established, they are mainly aimed at children. The model structures cannot directly satisfy the needs of most users.
Regarding recipe innovation, Lo et al. [16] built recipe ontology models involving three categories – ingredients, recipes and diseases – and calculated the nutritional value of recipes via machine learning methods. Makino et al. [17] developed a dessert food ontology and generated new recipes with regard to three aspects: the basic ingredients, decorative ingredients and production process. Pini et al. [18] used artificial intelligence technology to investigate the creation of new recipes based on the similarity and novelty of ingredients in the constructed knowledge graph as well as the taste and nutritional value of recipes. For recipe innovation, researchers mainly focus on ingredients and combine machine learning models with similarity algorithms.
To achieve more accurate recipe retrieval, researchers have considered many aspects in ontology models. For example, Helmy et al. [19] built a retrieval system by integrating nine types of ontologies, including recipes, nutrition and diseases, to acquire information regarding food, health and so on. According to the demands of users in the recipe-making process, Manna et al. [20] incorporated the ingredients, measuring tools and production process involved in the recipe into a question-and-answer search. In 2017, a rich recipe ontology was published on the OpenKG 1 website, which involved attributes such as taste, ingredients, cooking time, difficulty and steps. In addition, scholars have discussed ingredient replacement [21,22] and food database management [23] based on recipe ontologies.
2.2. Knowledge graph construction strategy
Researchers have found that transforming text into complex network models can be useful for analysing text and finding similar text [24,25]. This provides convenience in recipe recommendation, innovation and retrieval. A knowledge graph is a semantic-network knowledge base that reveals the relationships between various concepts or items in the real world [26,27]. Compared with hierarchical and relational models, the knowledge graph can more effectively deal with the complex and dynamic changes in the internal relationships between the data and provide a new method for solving the problem of information retrieval [28].
A knowledge graph can be divided into schema and data layers from a logical viewpoint [27]. The schema layer is the conceptual model and logical basis of the knowledge graph, which formally represents the association between various things in the real world, including concepts, relationships and attributes [29].
The schema layer of the knowledge graph mainly adopts an ontology model to constrain the data layer. For the automatic construction of ontology, Krishnan et al. [30] proposed the use of WordNet to extract the semantic concepts, hypernyms and hyponyms of each word in the text. It can be used for the automatic construction of an ontology. Aiming at the automatic construction of an ontology in a digital library, Sie et al. proposed a method for combining results from specific knowledge networks with the automatic ontology generation of digital library metadata to produce a human-readable, semantic-oriented hierarchy. The method is based on the existing schema hierarchy information stored in a digital library and requires a data foundation [31]. Asim et al. discussed the ontology construction process with regard to three aspects: linguistics, statistics and logic. They summarised the automatic realisation of entity extraction and relation extraction through semantic analysis, co-occurrence analysis, cluster analysis and other techniques, and compared the advantages and disadvantages of various methods. They found that a hybrid approach involving linguistic and statistical techniques results in better ontologies [32]. For manually building ontologies, METHONTOLOGY [33,34], the five-step cycle methodology [35], the skeletal methodology [36,37] and the seven-step methodology [38] are the most widely used methods. Each ontology construction method is developed for a specific field and cannot be applied directly to other fields. There are deficiencies and defects in building knowledge graphs with different requirements.
In the data layer of the knowledge graph, data are mainly stored in the form of triples of ‘entity–relationship–entity’ or ‘entity–attribute–attribute value’ [29]. Neo4j is a widely used database in which graphs consist of numerous points and relationships. The nodes and relationships can have properties in the form of key–value pairs [39]. Neo4j can analyse complex relationships between large amounts of data and process complex relationships significantly faster than traditional databases [40,41]. Researchers have used it to build knowledge graphs, such as typhoon disaster knowledge graphs [42], film knowledge graphs [43], protein knowledge graphs [44] and Asian music knowledge graphs [45]. In addition, the application of natural language processing and machine learning technology can assist in the extraction of structured data and improve the efficiency of the data layer construction of the knowledge graph [46,47].
From the related work, we found that research on recipe recommendation, innovation, retrieval, database management and so on is abundant; however, relatively few ontology studies on Chinese recipes have been performed. Healthy diets have become a trend, and it is crucial to satisfy users’ recipe retrieval demands in multiple dimensions. A knowledge graph can provide a model for information retrieval. However, some of the existing recipe ontology models are relatively rough, with few dimensions, whereas others have too many dimensions; that is, they lack versatility and cannot achieve efficient recipe recommendation and query functions. From the perspective of meeting the knowledge demands of users, a more refined and comprehensive general recipe ontology model was constructed in this study. Flower cultivation in China has a history of more than 7000 years. In addition to ornamental value, flowers have edible value. The construction of a knowledge graph of flower recipes is conducive to the realisation of digital humanities. We selected recipes for common edible flowers with health benefits and obtained a knowledge graph of flower recipes through the construction of the schema and data layers. In addition, we examined its effectiveness. This study lays a foundation for the development of knowledge services, such as efficient recipe retrieval, recommendation and intelligent question answering.
3. Methodology and overall framework
3.1. Methodology for constructing recipe knowledge graph
3.1.1. Methodology for constructing schema layer
The schema layer of the knowledge graph is mainly composed of structures of ‘concept–relation–concept’ and ‘concept–attribute–data type’, and it can be regarded as the ‘skeleton’ of the knowledge graph. The ontology model is mainly used as the schema layer of the knowledge graph. The ontology construction methods are discussed in section 2, and each method has a particular emphasis. Although these methods have strong versatility, when they are directly applied to the construction of knowledge graphs with strict practical requirements, there are deficiencies and defects.
In this study, we refer to the improved domain ontology construction method proposed by Liu [48]. The main process is described as follows. According to an extensive collection of recipe information, we determine the scope of entities and abstract the ontology and related terms according to the knowledge demands of users. We then revise the abstracted ontology to construct a general recipe ontology model that embodies concepts, attributes and relationships between concepts. Finally, in the process of instantiating the general recipe ontology model, improvements are made according to actual application scenarios. This method not only considers the top–down design but also considers the bottom layers of actual application scenarios and collects entity data to enrich and revise the design of the ontology, so that accurate knowledge retrieval and recommendation can be achieved.
3.1.2. Methodology for constructing data layer
Neo4j 2 is a graph database management system that consists of numerous nodes, relationships and properties. As mentioned in related work, it has a wide range of uses. Considering its efficient query performance and powerful flexibility, the Neo4j graph database is selected for storage in the data layer of the edible flower recipe knowledge graph [49]. We show the complex many-to-many relationship of the flower recipe ontology through the data association features of nodes, relationships and attributes. We use the powerful Cypher query function in Neo4j to discuss the effectiveness of the knowledge graph [50].
3.2. Research ideas and framework
First, we obtained recipes and users’ questions and answers data from websites and books to analyse users’ knowledge demands for recipes. Next, we constructed a general recipe ontology model combined with the existing ontology model. Then, we selected the recipes for common edible flowers, combined the application scenarios, and determined the schema layer of the knowledge graph of flower recipes according to the general recipe ontology model. Subsequently, we used machine learning to extract some entities to obtain structured data and used Neo4j to construct the data layer to obtain the knowledge graph of flower recipes. Finally, we examined its effectiveness in combination with user knowledge demands. The research framework is presented in Figure 1.

Figure 1. Overall framework of recipe knowledge graph construction.
3.3. Data collection and preprocessing
The object of recipe knowledge graph construction is the flower industry, which is growing with rural revitalisation. Flower cultivation in China has a history of more than 7000 years. In addition to ornamental value, flowers have edible value. According to ancient literature, chrysanthemum has long been used as an edible flower. In ‘Li Sao’, Qu Yuan reported a custom of eating flowers at that time: ‘朝饮木兰之坠露兮, 夕餐秋菊之落英 (drinking the falling dew of magnolia in the morning and eating the fallen flowers of autumn chrysanthemum in the evening)’. In the Han Dynasty, people drank chrysanthemum wine and orchid wine, and in the Tang Dynasty, people ate ‘百花糕 (hundred flower cakes)’. In the Song Dynasty, they ate flower soup, and in the Yuan Dynasty, they drank scented tea. Many flowers were used as food during the Qing Dynasty. Eating flowers has become part of food culture. Edible flowers contain several substances that are beneficial to the human body. Building a recipe knowledge graph for edible flowers may help users determine the nutritional value and efficacy of edible flower recipes in multiple ways.
3.3.1. Multi-source flower recipe data collection
Currently, data on edible flower recipes are incomplete. Along with descriptions of flower recipes in websites and books, we referred to the national edible flower standards and selected chrysanthemum, rose, osmanthus, sophora jasmine, jasmine and other common edible flowers as research objects to construct a data set.
The recipe websites included ‘美食杰 (Gourmet)’, 3 ‘美食天下 (Gourmet World)’, 4 ‘下厨房 (Xia Chu Fang)’ 5 and ‘豆果美食 (Douguo Food)’. 6 These websites are popular in China, have large user bases and host a very large number of recipes. Most recipes include introductions, ingredients, methods, effects and cooking techniques. Some recipes also involve cuisine categories, tips and so on. Regarding recipe books, we primarily searched for those related to ‘edible flowers’ in several large digital libraries in China, such as CADAL (China Academic Digital Associative Library, an international cooperation programme for university digital libraries), ‘Chaoxing Duxiu APP’, Nanjing Agricultural University Library and so on. Because these digital libraries cover a large number of books, they helped us collect as much relevant data as possible. After screening, we selected books such as ‘Flower Dishes’, ‘Flower Diet’, ‘Flower Health Food’, ‘Chinese Flower Medicinal Diet and Dietary Therapy’, ‘Common Flower Diet’ and ‘Chinese Flower Health Recipe’. We used scanning and optical character recognition to obtain relevant text data and performed manual proofreading.
Data on nutrients are typically obtained from the nutrient query website and the ‘Chinese Food Composition Table’; however, the nutrient data of flowers could not be retrieved. Therefore, we obtained the nutrient content data of the related flowers detected in the articles by retrieving journal papers and dissertations on the nutrient components of related flowers from the China National Knowledge Infrastructure (CNKI) and other databases.
3.3.2. Flower recipe data preprocessing
An analysis revealed problems in the collected data; thus, the data were preprocessed as follows:
In some cases, the contents of the recipes were repeated. By comparing recipes from different sources, duplicate recipes were eliminated. However, we kept recipes with the same ingredients but different cooking processes and assigned each of them its own recipe number.
Taking ‘peach blossom’ as an example, some recipes listed ‘peach blossom’ as ingredients but were unrelated to peach blossoms, such as peach blossom shrimp and peach blossom octopus. In these recipes, ‘peach blossom’ referred not to the flowers but to ‘walnut peanuts’ or ‘peach gum’.
In some cases, the recipe names were misleading. For example, in the ‘Fried Shark’s Fin with Osmanthus Flower’ and ‘Hibiscus Crucian Carp’ recipes, sweet-scented Osmanthus and hibiscus flowers, respectively, are not used for cooking. We found that ‘sweet-scented osmanthus fried xx’ mostly refers to frying with egg liquid or egg yolk liquid containing various ingredients. ‘Hibiscus’ is an elegant name for eggs as an ingredient in recipes and is not related to edible flowers. Thus, irrelevant recipes were deleted.
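The preprocessing steps above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used in the study: the `RawRecipe` record, the `preprocess` function, the `R0001`-style numbering and the rule that two recipes are duplicates when their ingredient sets and whitespace-normalised process texts are identical are all assumptions. The exclusion set stands in for the manual judgement described above and lists only the two recipe names mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawRecipe:
    """Hypothetical record for one collected recipe."""
    name: str
    ingredients: frozenset
    process: str
    source: str


# Names judged (manually, in the study) to be unrelated to edible flowers.
EXCLUDED_NAMES = {
    "Fried Shark's Fin with Osmanthus Flower",
    "Hibiscus Crucian Carp",
}


def preprocess(recipes):
    """Drop irrelevant and duplicate recipes, then number the remainder."""
    seen = set()
    kept = []
    for recipe in recipes:
        if recipe.name in EXCLUDED_NAMES:
            continue  # misleading name: no flower is actually used
        # Assumed duplicate rule: same ingredients AND same process text.
        key = (recipe.ingredients, " ".join(recipe.process.split()))
        if key in seen:
            continue
        seen.add(key)
        kept.append(recipe)
    # Same ingredients with a different process survive as separate recipes.
    return [(f"R{i:04d}", recipe) for i, recipe in enumerate(kept, start=1)]
```

Under this rule, two porridge recipes that share ingredients but differ in cooking process receive separate numbers, whereas a copy of the same recipe from a second source is discarded.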
After data preprocessing, 6064 flower recipes remained. The data covered structured knowledge such as flower recipes, culinary crafts, flowers, auxiliary ingredients, ingredients, production processes and nutrients. However, some data were unstructured text, such as recipe introductions and recipe reviews. We discuss the use of machine learning models to perform named entity recognition for extracting entities in section 4.2.1.
4. Construction of flower recipe knowledge graph for user knowledge demands
4.1. Construction of schema layer of flower recipe knowledge graph
In this study, a general recipe ontology model was constructed by analysing user knowledge demands for recipes, and according to the characteristics of flower recipes, a flower recipe knowledge graph schema layer based on the general recipe ontology model was constructed.
4.1.1. Analysis of user knowledge demands for recipes
Diet is an important part of people’s lives, and eating healthily is a concern. The user knowledge demands for recipes play a decisive role in the effectiveness of the construction of recipe knowledge graphs. An analysis of the data from recipe websites and books revealed that the user knowledge demands for recipes have the following characteristics:
Basic demands: user demands for recipes mainly include ingredients, practices, cooking techniques, taste and cuisine. These are the most common knowledge requirements of users. For example, in the question-and-answer data from recipe websites, users ask how to make a recipe, what ingredients are needed and so on. Recipe books likewise introduce the ingredients, cooking methods, tastes and other contents of each recipe.
Functional demands: as people pay more attention to their health, users are increasingly concerned with the function of recipes, including nutrients (such as vitamins, proteins and phosphorus), suitable groups (such as the elderly and children), symptoms that can be relieved (e.g. ‘applicable to users with vomiting and nausea’ and ‘for users with weak spleen and stomach’) and effects such as face whitening, eyesight improvement and soothing. Although basic demands are essential, functional demands determine the users’ choice of recipes. When material needs are satisfied, emphasis is placed on improving physical fitness, and the function of recipes plays a key role.
Cultural demands: in addition to satisfying user demands at the material level, recipes play a significant role in culture. There is a saying in China that ‘民以食为天 (people take food as the sky)’; accordingly, Chinese traditional customs always involve eating. Users require recipes for special festivals. For example, rice dumplings are eaten at the Dragon Boat Festival, dumplings are eaten at the Spring Festival, and spring pancakes are eaten on the first day of spring. Cultural demands have also become an important part of the demand for recipe knowledge.
In addition, some users ask questions regarding the substitution of ingredients and the compatibility of ingredients, for example, ‘Can white vinegar be replaced with rice vinegar?’, ‘Can olive oil be replaced with cooking oil?’ and ‘Can crab and tomatoes be eaten together?’. Therefore, the compatibility between ingredients is also one of the knowledge demands of the users. According to the analysis of user demands for recipe knowledge, we constructed a recipe ontology model that provided a basis for the knowledge graph of flower recipes.
4.1.2. Construction of general recipe ontology model
The recipe ontology model is constructed by first considering the basic demands of users, including the recipe name, cooking process, cuisine type and ingredient category. Second, considering the functional demands of users for recipes, the nutrient composition, suitable population, efficacy and symptom that can be relieved are used as categories in the recipe ontology. We then define the attributes of these categories. The cultural demands of users are mainly related to the seasons and festivals when recipes are eaten. Therefore, we created an attribute under the ‘Recipe’ category called ‘Season for Eating’. In the process of defining attributes, we found that in the actual classification, the cuisine types of recipes (such as home-cooked dishes, Shanghai dishes, staple dishes, snacks and breakfast) are vague and difficult to determine. Users are typically not concerned with these sub-divisions. Therefore, we eliminated the ‘Cuisine Type’ category and made it an attribute of the ‘Recipe’. The ‘Suitable Population’ category mainly involves the elderly, children and so on and does not need to be divided into specific age groups or identities, and most recipes do not involve this category; thus, it was also adjusted to be an attribute in the ‘Recipe’ category. According to the above analysis, we selected the most important and frequent attributes of each category, as shown in Table 1.
Table 1. Categories and attributes of recipe ontology.
An analysis revealed that the two categories ‘Symptom’ and ‘Efficacy’ are relatively special in that only the ‘Name’ attribute is defined for them; all attributes are defined as strings. According to the set of six categories, that is, ‘Recipe’, ‘Ingredient’, ‘Nutrition composition’, ‘Culinary craft’, ‘Symptom’ and ‘Efficacy’, semantic relationships were created, as shown in Table 2.
Table 2. Semantic relationships between categories.
A general ontology model of the recipes was constructed according to the categories, attributes and semantic relations, as shown in Figure 2. On this basis, we selected the recipes for common edible flowers with health benefits to construct a recipe knowledge graph.

Figure 2. General model of recipe ontology.
4.1.3. Construction of flower recipe ontology model
The knowledge graph schema layer mostly adopts an ontology model. According to the constructed general recipe ontology model, we analysed the characteristics of flower recipes to construct a flower recipe ontology model for the schema layer of the knowledge graph. In practical application scenarios, a comprehensive understanding of flowers makes flower recipes more attractive, and treating flowers as a category of their own facilitates user retrieval and improves retrieval efficiency. Therefore, we set flowers as a separate category from ingredients and defined the related attributes: ‘Name’, ‘Alias’, ‘Genus’, ‘Edible Curative Effect’, ‘Culinary Application’ and ‘Edible Taboo’. In the semantic relationships between categories, there is a ‘Belong_to_Flower’ relationship between the recipe category and the flower category and a ‘Has_Nutrition’ relationship between the flower category and the nutrient category. The constructed schema layer of the flower recipe knowledge graph is illustrated in Figure 3.

Figure 3. Flower recipe knowledge graph schema layer.
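To make the role of the schema layer concrete, the following Python sketch encodes the ‘Flower’ category with the six attributes listed above and checks whether a proposed edge conforms to the two relationships named in this subsection. It is an illustration under stated assumptions: the class, field and function names are hypothetical, and only the relationships explicitly described here are included, not the full contents of Table 2.

```python
from dataclasses import dataclass, field


@dataclass
class Flower:
    """'Flower' category with the attributes named in the text."""
    name: str
    alias: list = field(default_factory=list)
    genus: str = ""
    edible_curative_effect: str = ""
    culinary_application: str = ""
    edible_taboo: str = ""


# Relationship name -> (source category, target category).
# Only the two relationships described in this subsection are listed.
SCHEMA_RELATIONS = {
    "Belong_to_Flower": ("Recipe", "Flower"),
    "Has_Nutrition": ("Flower", "Nutrition composition"),
}


def is_valid_edge(relation, source_category, target_category):
    """Return True if the edge direction and categories match the schema."""
    return SCHEMA_RELATIONS.get(relation) == (source_category, target_category)
```

A check of this kind is one way in which the schema layer can constrain the data layer: an edge such as `Belong_to_Flower` pointing from a flower to a recipe would be rejected before it is written to the graph.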
4.2. Construction of data layer of flower recipe knowledge graph
4.2.1. Knowledge extraction of flower recipe instances
In the process of knowledge graph construction, because some efficacy and symptom data are unstructured, knowledge entities cannot be quickly extracted from data; therefore, named entity recognition is a key step. We selected 400 pieces of online text data and 300 pieces of data from books as the corpus from preprocessed text data, such as flower recipe introductions and recipe reviews. We also compared the effects of several machine learning models on knowledge entity recognition. After manual annotation, model training and model testing, it was found that the bidirectional long short-term memory (BiLSTM)-conditional random field (CRF) model, which can effectively combine contextual semantic information, achieved the best results. We applied this model to unlabelled data for the automatic extraction of knowledge entities and performed manual proofreading to obtain structured data. The specific process is as follows:
1. Flower recipe knowledge entities labelling.
First, we performed data annotation on the corpus of flower recipes. According to the data query and manual judgement, two types of knowledge entities – efficacy and symptoms – were marked. In the labelling method, ‘[@’ and ‘*]’ were used to represent the left and right boundaries of the entity, and ‘Efficacy’ and ‘Symptom’, placed after a ‘#’ separator, were used to represent the category of the entity. An example of a labelled sentence is as follows:
菊花是中国常用中药,具有 (Chrysanthemum is a commonly used traditional Chinese medicine in China, with the effects of) [@疏风(dispel wind)#Efficacy*], [@清热(clear heat)#Efficacy*], [@明目(brighten the eyes)#Efficacy*], [@解毒(detoxify)#Efficacy*]之功效。主要治疗 (It mainly treats symptoms such as) [@头痛(headache)#Symptom*], [@眩晕(dizziness)#Symptom*], [@目赤(conjunctival hyperemia)#Symptom*], [@心胸烦热(dysphoria with feverish sensation in the chest)#Symptom*], [@疔疮(boils)#Symptom*], [@肿毒(swelling toxin)#Symptom*]等症.
In this sentence, ‘疏风 (dispel wind)’, ‘清热 (clear heat)’, ‘明目 (brighten the eyes)’ and ‘解毒 (detoxify)’ are descriptions of efficacy and were marked as efficacy entities. ‘头痛 (headache)’, ‘眩晕 (dizziness)’, ‘目赤 (conjunctival hyperemia)’, ‘心胸烦热 (dysphoria with feverish sensation in the chest)’, ‘疔疮 (boils)’ and ‘肿毒 (swelling toxin)’ are descriptions of symptoms and were marked as symptom entities. After the annotation was completed, the annotators exchanged corpora to ensure the accuracy and completeness of the annotation. Finally, the experimental data set was obtained.
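The labelling format above can be converted into character-level training data with a short script. The following Python sketch parses the ‘[@…#Label*]’ markup into plain text plus entity spans and then into per-character tags. The function names are hypothetical, and the BIO tagging scheme is an assumption, since the article does not state which scheme was used; the sample sentence is a shortened, Chinese-only variant of the example above.

```python
import re

# Matches one annotated entity, e.g. "[@清热#Efficacy*]".
ENTITY = re.compile(r"\[@(.+?)#(Efficacy|Symptom)\*\]")


def parse_annotated(text):
    """Return (plain_text, [(start, end, label), ...]); `end` is exclusive."""
    pieces, spans = [], []
    pos = 0      # read position in the annotated text
    length = 0   # length of the plain text built so far
    for match in ENTITY.finditer(text):
        before = text[pos:match.start()]
        pieces.append(before)
        length += len(before)
        surface, label = match.group(1), match.group(2)
        spans.append((length, length + len(surface), label))
        pieces.append(surface)
        length += len(surface)
        pos = match.end()
    pieces.append(text[pos:])
    return "".join(pieces), spans


def to_bio(plain, spans):
    """Assign one BIO tag per character (assumed tagging scheme)."""
    tags = ["O"] * len(plain)
    for start, end, label in spans:
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags


sample = "具有[@清热#Efficacy*]之功效,主治[@头痛#Symptom*]等症"
plain, spans = parse_annotated(sample)
tags = to_bio(plain, spans)
```

Here `plain` is the sentence with the markup removed, and `tags` aligns with it character by character, which is the input shape a sequence-labelling model typically expects.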
2. Flower recipe knowledge entity automatic extraction
The BiLSTM-CRF model is based on the BiLSTM and CRF models. In a semantic environment that fully considers the context, combining the learned rules to constrain the predicted labels can significantly improve the performance of the model. In the experiment, a cross-validation method was adopted. We compared the BiLSTM-CRF model with hidden Markov model (HMM), BiLSTM and CRF models. We used 90% of the labelled data set as the training set and 10% as the test set. The Python programming language was employed, and TensorFlow was used to build the model. The number of iterations was set to more than 50, and the results of each iteration were stored. The optimal model was determined in the experiment, as shown in Table 3.
Table 3. Best results of machine learning models.
HMM: hidden Markov model; BiLSTM: bidirectional long short-term memory; CRF: conditional random field. The bold values represent the best results in the F1-score columns.
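The benefit of placing a CRF layer on top of a BiLSTM comes from decoding the whole tag sequence jointly, so that learned transition scores can rule out invalid label sequences. The following pure-Python sketch of Viterbi decoding illustrates that idea only; it is not the TensorFlow model used in the study. The function name, the score values and the penalty of -100 are invented for illustration, and in a real model the emission scores would come from the BiLSTM and the transition scores would be learned.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions:   non-empty list with one dict {tag: score} per token
    transitions: dict {(previous_tag, current_tag): score}; missing pairs score 0
    """
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


TAGS = ["O", "B-Efficacy", "I-Efficacy"]
# Invented constraint: an I- tag may not directly follow O.
TRANSITIONS = {("O", "I-Efficacy"): -100.0}
# Invented per-token scores for a two-character entity.
EMISSIONS = [
    {"O": 1.0, "B-Efficacy": 0.9, "I-Efficacy": 0.0},
    {"O": 0.0, "B-Efficacy": 0.0, "I-Efficacy": 1.0},
]
decoded = viterbi(EMISSIONS, TRANSITIONS, TAGS)
```

Choosing the best tag for each token independently would give ‘O’ followed by ‘I-Efficacy’, which is not a valid entity; joint decoding with the transition penalty instead selects ‘B-Efficacy’ followed by ‘I-Efficacy’.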
The results indicated that the BiLSTM-CRF model outperformed the other three models. It can recognise some entities in the corpus. For the identification of both efficacy and symptom entities, F-scores of >85% were achieved, with an average F-score of 87.16% for the two types of entities. This indicates that the proposed method can be used for automatic entity extraction. We used the model to automatically extract the efficacy and symptom entities from unlabelled texts, and then manually proofread and organised them.
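For reference, precision, recall and F-score over extracted entities can be computed as in the sketch below. It assumes exact-match, entity-level scoring, in which a predicted entity counts as correct only if its sentence, boundaries and type all match the gold annotation; the article does not specify the matching criterion, so this is an assumption, and the tuples used in the example are invented.

```python
def prf1(gold, predicted):
    """Exact-match precision, recall and F1 over two collections of entities."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Entities as (sentence_id, start, end, label); values are invented.
gold_entities = {(0, 2, 4, "Efficacy"), (0, 10, 12, "Symptom"), (1, 0, 2, "Symptom")}
predicted_entities = {(0, 2, 4, "Efficacy"), (0, 10, 12, "Efficacy")}
precision, recall, f1 = prf1(gold_entities, predicted_entities)
```

In this invented example the second prediction has the right boundaries but the wrong type, so it counts as an error under exact matching.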
Through knowledge extraction and fusion of knowledge entities with different aliases of the same ingredient, we counted 13,278 entities in the flower recipe data. There were 54,221 relationship pairs between the entities. The entity statistics are presented in Table 4, and the relationship statistics are presented in Table 5.
Table 4. Number of flower recipe entities.
Table 5. Number of flower recipe relationships.
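The fusion of entities that appear under different aliases can be illustrated with a simple canonicalisation step. The Python sketch below is an assumption about how such fusion might be done, not the procedure reported in the study: the function names, the ‘Has_Ingredient’ relationship name, the recipe name and the alias table are all hypothetical.

```python
def build_alias_index(alias_table):
    """Map every alias, and the canonical name itself, to the canonical name."""
    index = {}
    for canonical, aliases in alias_table.items():
        index[canonical] = canonical
        for alias in aliases:
            index[alias] = canonical
    return index


def fuse(triples, index):
    """Rewrite triple endpoints to canonical names and drop exact duplicates."""
    return {
        (index.get(head, head), relation, index.get(tail, tail))
        for head, relation, tail in triples
    }


# Hypothetical alias table: two names for the same ingredient.
alias_index = build_alias_index({"Wolfberry": ["Goji berry"]})
raw_triples = [
    ("Recipe A", "Has_Ingredient", "Goji berry"),
    ("Recipe A", "Has_Ingredient", "Wolfberry"),
]
fused_triples = fuse(raw_triples, alias_index)
```

After fusion, the two raw triples collapse into a single relationship to one ingredient entity, which is how alias handling reduces the entity and relationship counts.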
4.2.2. Flower recipe knowledge graph construction
The knowledge graph in Neo4j is composed of three modules: nodes, relationships and attributes. According to the constructed schema layer of the flower recipe knowledge graph and the structured data, we constructed the data layer of the flower recipe knowledge graph. The names of the categories in the schema layer were used as the nodes in the data layer, the attributes in the schema layer were used as the attributes of each node in the data layer and the relationships between the categories in the schema layer were used as the relationships between nodes in the data layer. The py2neo 7 package in Python was used to create a knowledge graph stored in Neo4j. When creating relationships between recipes, flowers, ingredients and nutrients, numeric attributes were stored as part of the relationships. For example, for ‘the nutrients contained in chrysanthemum (per 100 g) are 242 kcal of energy, 63 g of carbohydrates and so on’, we used the ‘create_relationship’ function from the py2neo package in Python. The code was as follows: ‘create_relationship(‘菊花 (Chrysanthemum)’, ‘热量 (Calorie)’, ‘Has_Nutrition’, {‘Content’: ‘242 kcal/100 g’})’ and ‘create_relationship(‘菊花 (Chrysanthemum)’, ‘碳水化合物 (Carbohydrate)’, ‘Has_Nutrition’, {‘Content’: ‘63 g/100 g’})’. The constructed flower recipe knowledge graph is shown in Figure 4.

Figure 4. Knowledge graph of flower recipes and their related entities. Blue nodes represent the flower recipes, for example, ‘Chrysanthemum wine’ and ‘Mulberry and chrysanthemum porridge’. The red node represents the flower entity, that is, ‘Chrysanthemum’. Purple nodes represent the ingredients, for example, ‘Rice’, ‘Wolfberry’ and ‘Angelica’. Green nodes represent the symptoms, such as ‘Headache and tinnitus’ and ‘Dizziness’. Pink nodes represent the effects, for example, ‘Nourishing yin and liver’ and ‘Sharpen hearing and brighten eyes’. The yellow node represents the culinary craft, which is ‘Brewing’.
5. Discussion
In this study, the Cypher language in Neo4j was used to build queries [51] involving recipe ingredients, cooking techniques, efficacy and symptomatic relief. Taking the question ‘Which flower recipes contain Osmanthus?’ as an example, we used the command ‘match (n)-[:Belong_to_Flower]->(:Flower{name:'桂花 (Osmanthus)'}) return n’ to retrieve the knowledge the user requires. Part of the query results are shown in Figure 5. For complex relational queries, the question ‘Which recipes use boiling as the cooking technique and have beauty benefits?’ is taken as an example. The corresponding command was ‘match (:Efficacy{Name:'美容 (Cosmetic)'})<-[t:Has_Efficacy]-(n)-[r:Need_Craft]->(:Craft{Name:'煮 (Boil)'}) return n, r, t limit 10’, and the query result is shown in Figure 6.

Figure 5. Partial results of a recipe query containing ‘Osmanthus’. The red node represents the entity of ‘Osmanthus’. The blue nodes represent flower recipes, such as ‘Osmanthus flowers with rice dumpling soup’ and ‘Brown sugar osmanthus cake’.

Figure 6. Results of a complex relationship query. The pink node represents the efficacy, that is, ‘Cosmetic’. The yellow node represents the culinary craft, which is ‘Boil’. The blue nodes represent the flower recipes, such as ‘Rose tremella soup’ and ‘Homemade rose brown sugar ginger tea’.
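The two queries discussed above can also be written with query parameters instead of literal values, which makes them reusable for other flowers, effects and crafts. The sketch below shows such parameterised forms and, so that the matching logic can be followed without a database, evaluates the two-relationship pattern over a small in-memory list of triples. The property key `name`, the function names and the toy triples are assumptions made for illustration; only the relationship types and the recipe names are taken from the text and figures.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (recipe, relationship type, target node name)

# Parameterised forms of the two queries; $flower, $efficacy and $craft are
# query parameters. The property key `name` is an assumption.
RECIPES_BY_FLOWER = (
    "MATCH (n)-[:Belong_to_Flower]->(:Flower {name: $flower}) RETURN n"
)
RECIPES_BY_EFFICACY_AND_CRAFT = (
    "MATCH (:Efficacy {name: $efficacy})<-[t:Has_Efficacy]-(n)"
    "-[r:Need_Craft]->(:Craft {name: $craft}) "
    "RETURN n, r, t LIMIT 10"
)


def match_targets(triples: Iterable[Triple], rel_type: str, target: str) -> Set[str]:
    """Recipes that have a `rel_type` relationship pointing at `target`."""
    return {head for head, rel, tail in triples if rel == rel_type and tail == target}


def recipes_by_efficacy_and_craft(
    triples: List[Triple], efficacy: str, craft: str, limit: int = 10
) -> List[str]:
    """In-memory analogue of the second query: both relationships must hold."""
    both = match_targets(triples, "Has_Efficacy", efficacy) & match_targets(
        triples, "Need_Craft", craft
    )
    return sorted(both)[:limit]


# Toy triples for illustration only.
TRIPLES: List[Triple] = [
    ("Rose tremella soup", "Has_Efficacy", "Cosmetic"),
    ("Rose tremella soup", "Need_Craft", "Boil"),
    ("Homemade rose brown sugar ginger tea", "Has_Efficacy", "Cosmetic"),
    ("Homemade rose brown sugar ginger tea", "Need_Craft", "Boil"),
    ("Chrysanthemum wine", "Need_Craft", "Brewing"),
    ("Brown sugar osmanthus cake", "Belong_to_Flower", "Osmanthus"),
]

if __name__ == "__main__":
    print(recipes_by_efficacy_and_craft(TRIPLES, "Cosmetic", "Boil"))
    print(sorted(match_targets(TRIPLES, "Belong_to_Flower", "Osmanthus")))
```

Passing values as parameters rather than concatenating them into the query text also avoids quoting problems with names that contain parentheses or quotation marks.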
As the example queries in Figures 5 and 6 show, the knowledge graph of flower recipes constructed in this study can effectively provide the information required by users. In addition, when users query a specific flower recipe, the graph can return recipes that are similar in one or several characteristics, based on the associations between recipe nodes, which gives users multiple choices. The knowledge graph of flower recipes can therefore satisfy user knowledge demands such as flower recipe retrieval and recipe recommendation.
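One simple way to picture recommendation by shared characteristics is to compare the sets of nodes that two recipes are connected to. The sketch below ranks recipes by the Jaccard similarity of their (relationship, target) pairs. This is an assumed illustration rather than the method used in this study, and the triples are toy data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (recipe, relationship type, target node name)
Feature = Tuple[str, str]      # (relationship type, target node name)


def neighbours(triples: Iterable[Triple]) -> Dict[str, Set[Feature]]:
    """Collect, for each recipe, the set of (relationship, target) pairs."""
    result: Dict[str, Set[Feature]] = defaultdict(set)
    for head, rel, tail in triples:
        result[head].add((rel, tail))
    return result


def similar_recipes(triples: List[Triple], recipe: str, top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank other recipes by Jaccard similarity of their neighbour sets."""
    features = neighbours(triples)
    target = features.get(recipe, set())
    scores: List[Tuple[str, float]] = []
    for other, other_features in features.items():
        if other == recipe:
            continue
        union = target | other_features
        score = len(target & other_features) / len(union) if union else 0.0
        if score > 0:
            scores.append((other, score))
    # Highest score first; ties are broken alphabetically for determinism.
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:top_k]


# Toy triples for illustration only.
SAMPLE: List[Triple] = [
    ("Rose tremella soup", "Has_Efficacy", "Cosmetic"),
    ("Rose tremella soup", "Need_Craft", "Boil"),
    ("Rose tremella soup", "Belong_to_Flower", "Rose"),
    ("Homemade rose brown sugar ginger tea", "Has_Efficacy", "Cosmetic"),
    ("Homemade rose brown sugar ginger tea", "Need_Craft", "Boil"),
    ("Homemade rose brown sugar ginger tea", "Belong_to_Flower", "Rose"),
    ("Osmanthus flowers with rice dumpling soup", "Need_Craft", "Boil"),
    ("Osmanthus flowers with rice dumpling soup", "Belong_to_Flower", "Osmanthus"),
    ("Chrysanthemum wine", "Need_Craft", "Brewing"),
    ("Chrysanthemum wine", "Belong_to_Flower", "Chrysanthemum"),
]

if __name__ == "__main__":
    for name, score in similar_recipes(SAMPLE, "Rose tremella soup"):
        print(f"{name}: {score:.2f}")
```

Weighting some relationship types more heavily than others, for instance efficacy over craft, would be a natural refinement of this unweighted comparison.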
6. Conclusion and future work
The nutritional value, efficacy and combination of ingredients in food are closely related to people’s health. Recipe knowledge is scattered across multiple data sources such as websites, books and journals. To the best of our knowledge, no recipe knowledge graph based on user knowledge demands had been constructed prior to this study. Some existing recipe ontology models are coarse, with few dimensions, whereas others have too many dimensions; both kinds lack versatility and cannot support efficient recipe recommendation and query functions. In this study, the knowledge demands of users were analysed from multi-source recipes, and a general recipe ontology model was developed. The use of flowers as food has a history of thousands of years in China, and the construction of a knowledge graph of flower recipes is conducive to the realisation of digital humanities. Considering the characteristics of flower recipes, an ontology and a knowledge graph of recipes for common edible flowers with health benefits were constructed. The knowledge graph was applied, and the results confirmed that it can accurately provide the knowledge users need and satisfy their basic, functional and cultural knowledge demands. Thus, the recipe ontology model constructed according to user knowledge demands is effective and generalisable, and the constructed flower recipe knowledge graph has application value.
However, the constructed flower recipe knowledge graph does not yet support queries for ingredient substitution, which would require a method for calculating the similarity between ingredients. We will conduct follow-up research to improve the construction method for the recipe knowledge graph. In addition, using a recipe knowledge graph to predict the nutritional value and efficacy of recipes, as well as to support recipe innovation, is suggested as a future research direction.
Acknowledgements
The authors would like to thank the China Scholarship Council (CSC No. 202106850069) for its support for the visiting study.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the National Social Science Foundation of China under grant no. 19BTQ036.
