Research on Recommending Scientific and Technological Reports for the Needs of Science and Technology Enterprises

Abstract: To improve the research and innovation capability of science and technology enterprises, we recommend scientific and technological reports that match the individual needs of these enterprises and provide them with theoretical and technical support. This paper proposes a report recommendation method that integrates feature value and time value. First, the TF-IDF algorithm and the word2vec language model are used to construct the feature vector of enterprise demand. Second, LDA, TF-IDF and word2vec are used to construct the feature vector of scientific and technological reports, and the time value of each report is calculated with reference to the negative exponential equation of literature aging and the half-life index of downloads. Finally, the similarity between the enterprise demand feature vector and the report vector is calculated, the recommendation results are obtained by sorting, and the method is compared with other methods. The empirical results show that the reports recommended by this method not only meet the needs of science and technology enterprises but are also novel and cutting-edge.


Introduction
Scientific and technological reports describe the application process of cutting-edge core technologies and are a basic strategic resource of the country. To promote the construction of a country strong in science and technology, and in accordance with the deployment of the action to promote the transfer and transformation of scientific and technological achievements, the Ministry of Science and Technology focused on the achievements generated by the national "863" and "973" programs, the national key research and development plan, the national science and technology support plan and other publicly funded science and technology plans. It summarized and released a number of advanced and applicable achievements in line with the direction of industrial transformation and upgrading, covering key core technologies, cutting-edge leading technologies and scientific and technological innovation theories in 11 technical fields such as new-generation information, energy, modern agriculture, high-end equipment and advanced manufacturing.

Related work
At present, research on recommending scientific and technological resources focuses mainly on literature recommendation and patent recommendation. Literature recommendation methods include content-based filtering, collaborative filtering and graph-based recommendation. Content-based filtering takes the text of a document as its object and recommends documents with high similarity. Xiong Huixiang et al. [1] argue that recommended papers with high reference value have both time value and feature value. They divide the semantics of keywords into four categories, including research topic, research scope and theoretical technology, establish a co-occurrence matrix, calculate the similarity between keywords, and integrate literature aging factors to produce list recommendations. Li Yamei et al. [2] used the LDA topic model to construct the situational topic preferences of researchers from the scientific and technological documents they used, then used similarity calculation to find research groups in the same situation, combined their topic preferences, and recommended suitable documents based on the situational needs of the target researcher. Collaborative filtering recommends relevant papers to scholars based on literature rating information (see Wu Lei et al. [3]). Graph-based recommendation mainly uses knowledge graphs and graph algorithms.
Patent recommendation methods mainly include topic models, machine learning and deep learning. Topic models are used to mine patent texts and find collections of texts on the same topic. Liu Wei et al. [4] proposed a patent recommendation algorithm based on topic classification and semantic similarity: BERT is used to extract keywords and generate word vectors from patent titles and abstracts, the DBSCAN clustering method is used to construct patent topic categories from the word vectors, and these are combined with the text similarity framework SimNet to form an overall analysis model. Commonly used machine learning methods include SVM, decision trees, naive Bayes and random forests. Li Zhenyu et al. [5] proposed a cross-domain patent knowledge recommendation method based on deep learning to solve the problem of recommending cross-domain patents. The patent problem space and knowledge space are generated with the semi-supervised learning algorithm (TG-TCI) and the named entity recognition algorithm BERT-BiLSTM-CRF, respectively, and the validity and feasibility of the model are verified by evaluating and analyzing actual cases.
Research on scientific and technological reports currently focuses on quality assessment and control, status analysis, institutional rules and academic cooperation, and studies on recommending scientific and technological reports are relatively few. Lu Linna and Yuan Fang [6] recommend suitable research scholars and research institutions by using network analysis and bibliometrics. Topic model methods can divide a document collection into categories; they take word frequency and word co-occurrence into account and highlight the affiliation between documents and topics, but they ignore the individual characteristics of documents under the same topic and do not extract semantic relationships between words. In addition, the use value of a scientific and technological report is related to time: the more recent its publication, the more novel the theories and technologies it proposes and adopts, and the more valuable it is for use and study.
Therefore, to further develop scientific and technological report resources and expand the ways in which they are served and used, this paper proposes a report recommendation method that integrates feature value and time value. The method makes full use of the latest report results and theoretical technologies, provides suitable research resources for the needs of scientific and technological enterprises, and fills the research gap concerning the role of scientific and technological reports in meeting the actual needs of these enterprises.

Framework for scientific and technological report finding
The core of the recommendation model is the construction of the demand vector of scientific and technological enterprises and the feature vector of scientific and technological reports. The demand information of an enterprise includes the technical bottlenecks it faces and the theories and technologies it needs in its research and development. The TF-IDF algorithm is used to extract the key information of the requirement text, and the word2vec word vector model is used to vectorize the keywords to form the requirement vector. To construct the feature vector of a report, the LDA topic model is used to divide the report documents into different scientific and technological topics, and the report-topic-subject word probability distributions are used to obtain the topic features of each report. The TF-IDF algorithm is used to extract the keywords of the report as its personality features, and the word2vec word vector model is used to transform them into a vector. Considering that the usefulness of a report depends on its timeliness, the time value of a report is obtained from a report aging formula built on the negative exponential equation of document aging and the half-life index of download counts. The topic feature vector, personality feature vector and time value are combined to give the feature vector of the report. Finally, the similarity between the demand vector and the report vector is calculated and sorted, and the final recommendation result is obtained and analyzed.

Construction of demand characteristic vector of science and technology enterprises
TF-IDF is a common weighting method, used here to identify feature words in the set of requirement documents of science and technology enterprises. TF denotes term frequency, IDF denotes inverse document frequency, and TF-IDF is used to calculate the weight of a keyword in the keyword set of a requirement document. The calculation is given in Eq. (1):

$$W_{t_i} = tf(t_i, d) \times \log \frac{|D|}{df(t_i)} \tag{1}$$

where $i$ is the index of the keyword in the requirement document, $W_{t_i}$ is the content weight of keyword $t_i$, $tf(t_i, d)$ is the frequency with which keyword $t_i$ appears in the keyword set $d$ of the requirement document, $|D|$ is the number of requirement documents, and $df(t_i)$ is the number of requirement documents that contain keyword $t_i$.
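
The weighting described above (term frequency multiplied by the logarithm of the number of documents over the document frequency) can be illustrated with a minimal Python sketch. It assumes raw counts for the term frequency and the natural logarithm, neither of which is stated in the text; the helper names `tfidf_weights` and `top_keywords` and the sample keywords are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {keyword: weight} dict per document.

    Weight = tf(t, d) * log(|D| / df(t)). Raw counts and the natural
    logarithm are assumptions made for this sketch.
    """
    n_docs = len(docs)
    # df(t): number of documents that contain keyword t at least once
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw count of each keyword in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


def top_keywords(weight_map, k):
    """Keywords by descending weight; ties are broken alphabetically."""
    ranked = sorted(weight_map.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical, already segmented requirement documents
docs = [
    ["chip", "packaging", "chip", "heat"],
    ["sensor", "chip", "signal"],
    ["sensor", "noise", "signal", "signal"],
]
weights = tfidf_weights(docs)
print(top_keywords(weights[0], 2))
```

A keyword that appears in every document receives weight zero under this formula, which is why frequent but uninformative words drop out of the keyword set.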

Construction of scientific and technological report vector
In this section, we use the LDA topic model and the TF-IDF algorithm to obtain the topic features and personality features, respectively. In addition, we introduce a time value to evaluate the aging degree of scientific and technological reports.
The segmented report texts are fed into the LDA model to obtain the number of scientific and technological topics, the report-topic probability distribution and the topic-subject word probability distribution. In Eq. (3), $tw_{ni}$ is subject word $i$ in topic $n$, $vec(t_i)$ is the vector of science and technology subject $i$, $tdp_{nm}$ denotes report $m$ under topic $n$, and $vec(d_i)$ is the vector of report $i$.
In addition to topic features, personality features are an important part of the information in a scientific and technological report and mainly represent what is unique to that report. Following Eq. (1), TF-IDF is used to calculate the word weights in each report. After ranking, the 20 words with the highest weights are taken as the report's personality feature words.
Based on the negative exponential equation of document aging, this paper constructs a time value index for scientific and technological reports and calculates their aging degree, as shown in Eq. (5), where $h_i$ is the aging degree of report $i$, $T_{life}$ is the number of years since the report was published, $a$ is the aging rate, and $T$ is the half-life.
To sum up, the topic feature vector, personality feature vector and time value coefficient of a report are fused to obtain the report vector, as shown in Eq. (6).
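
To give a feel for how a negative exponential aging term behaves, the following Python sketch evaluates the classical form $e^{-a \cdot T_{life}}$. This is an assumption made for illustration only: the paper's Eq. (5) also involves the half-life $T$, and its exact form is not reproduced here, so the sketch should not be read as the paper's formula. The function names are hypothetical.

```python
import math


def aging_degree(t_life, a):
    """Classical negative exponential aging, exp(-a * t_life).

    Assumed form for illustration; the paper's Eq. (5) may differ.
    """
    if t_life < 0:
        raise ValueError("t_life must be non-negative")
    return math.exp(-a * t_life)


def half_life(a):
    """Time after which the value halves under the classical form."""
    return math.log(2) / a


# Value retained by reports of different ages at aging rate a = 0.5
for years in (0, 1, 3, 6):
    print(years, round(aging_degree(years, a=0.5), 4))
```

Under this assumed form, a newly published report keeps its full value and older reports are discounted monotonically, which matches the intent of weighting recent reports more heavily.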

Scientific and technological report vector finding
The cosine similarity between the demand feature vectors of scientific and technological enterprises and the report feature vectors obtained above is calculated as shown in Eq. (7). After sorting, the ranking and similarity score of each report are obtained, and the top results are selected for recommendation.
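
The ranking step can be sketched in Python using the standard definition of cosine similarity (dot product divided by the product of the vector norms). The function names, report identifiers and three-dimensional vectors below are hypothetical; the paper's vectors have 200 dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def recommend(demand_vec, report_vecs, top_n):
    """Return the top_n (report_id, score) pairs, highest score first."""
    scored = [(rid, cosine_similarity(demand_vec, vec))
              for rid, vec in report_vecs.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical toy vectors
reports = {
    "r1": [1.0, 0.0, 0.0],
    "r2": [0.6, 0.8, 0.0],
    "r3": [0.0, 0.0, 1.0],
}
top = recommend([1.0, 1.0, 0.0], reports, top_n=2)
print(top)
```

Because cosine similarity depends only on direction, any uniform rescaling of a single report vector does not change that report's score.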

Experiments
We take the electronic information industry as an example and collect the demand data of scientific and technological enterprises in the industry from 2018 to 2021, a total of 243 records. The scientific and technological report data are mainly obtained from the website of the National Science and Technology Report Service System, a total of 36006 items. The enterprise demand texts were screened, records without a title or demand details and duplicate records were removed, and 229 records were finally obtained. For the reports, data from 2015 onward were selected, null values and duplicates were removed, and 12807 valid records were finally obtained.
The segmented texts of the scientific and technological service demands and the segmented report documents are combined, and the skip-gram model in word2vec is used for training. The sliding window size of the model is 2, and the vector dimension is 200. The half-life of the applied computer discipline is about 3 years, so T=3 is taken as the half-life of downloaded scientific and technological reports in the discipline, and the aging rate a is taken as 0.5.
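
To illustrate what a sliding window of size 2 means for skip-gram training, the following Python sketch enumerates the (center word, context word) pairs that such a window produces for one segmented text. It is a simplified illustration with a hypothetical function name and sample tokens, not the training procedure itself; library implementations of word2vec add further steps such as negative sampling.

```python
def skipgram_pairs(tokens, window=2):
    """List (center, context) pairs within `window` positions of each token."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical segmented demand text
tokens = ["radar", "signal", "processing", "chip", "design"]
pairs = skipgram_pairs(tokens, window=2)
for center, context in pairs[:4]:
    print(center, "->", context)
```

With a window of 2, "radar" is paired with "signal" and "processing" but not with "chip", so only closely neighboring words influence each other's vectors.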

Result
To evaluate the effectiveness of this method, two other methods are selected for comparison: the LDA-based topic feature recommendation method and the TF-IDF personality feature recommendation method. Performance is evaluated in terms of precision and time value. Precision is the proportion of reports in the recommended list that meet the scientific and technological needs of the enterprise, relative to the total number of recommendations. Let $R(p)$ be the list of recommendations predicted by the model for enterprise demand $p$ on the test set, and let $T(p)$ be the list of scientific and technological reports that effectively match enterprise demand $p$. Precision is then defined as:

$$\text{Precision} = \frac{|R(p) \cap T(p)|}{|R(p)|}$$

We calculate the precision (Fig. 1) and time value (Fig. 2) of the Top-N (N=5, 10, 15, 20) recommendation lists, using a random 20% sample of the enterprise demand data as the test set. The experimental results show that this method performs better than the two comparison methods: it not only meets the needs of scientific and technological enterprises but also provides them with reports of higher time value.
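
The precision measure described above (matching reports in the recommended list divided by the number of recommendations) can be sketched in Python for a single demand. The function name, report identifiers and relevance set are hypothetical and are not taken from the experiments.

```python
def precision_at_n(recommended, relevant, n):
    """Share of the first n recommended reports that are relevant."""
    top = recommended[:n]
    if not top:
        return 0.0  # convention chosen here for an empty list
    relevant = set(relevant)
    hits = sum(1 for report_id in top if report_id in relevant)
    return hits / len(top)


# Hypothetical ranked recommendations R(p) and matching reports T(p)
recommended = ["r7", "r2", "r9", "r4", "r1"]
relevant = {"r2", "r4", "r8"}

p5 = precision_at_n(recommended, relevant, 5)
p2 = precision_at_n(recommended, relevant, 2)
print(p5, p2)
```

Averaging this value over all demands in the test set gives one figure per value of N, which is one plausible way to produce a Top-N comparison across methods.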

Conclusion
We propose a recommendation method for scientific and technological reports that integrates topic features, personality features and time value to meet the needs of scientific and technological enterprises. The method is compared with recommendation based on topic features alone and recommendation based on personality features alone. The results show that the proposed fusion model performs well: it can provide enterprises with reports that are recent and consistent in research content with their needs, and it offers theoretical and methodological support for enterprises carrying out product research and technological innovation. This paper has the following limitations. In the representation of the enterprise demand vector, the semantics of the demand text are not considered, and only keywords are used. In terms of application scenario, recommendations are made only for the needs of information technology enterprises, so the results are limited to and specific to this scenario. In the future, enterprises from different industries will be analyzed to improve the robustness of the model. In addition, because actual download data for scientific and technological reports are lacking, the aging rate and half-life of the reports are not precise enough. These problems deserve further study.