abstract |
The invention discloses a description-based patent classification method, belonging to the fields of text processing and data mining. First, the text of the patent specification is preprocessed. Next, an inverted index file is constructed, and feature words are selected using a feature selection method that combines information gain and word frequency. An improved TF-IDF formula is then used to calculate the weights of the feature words and to construct the patent feature vectors. A training set of patent fields is then built, and finally an optimized KNN classifier is used to classify the patents. The method provides a new approach to the classification of patent documents and lays a foundation for further research on the intelligent retrieval of patent documents. |
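
The pipeline described above can be illustrated with a minimal Python sketch. The abstract does not state how information gain and word frequency are combined, what the improved TF-IDF formula is, or how the KNN classifier is optimized, so the sketch substitutes assumptions for all three: features are scored as information gain multiplied by log(1 + total term count), weights use a standard smoothed TF-IDF, and classification is plain cosine-similarity KNN with majority voting. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def build_inverted_index(docs):
    """Map each term to {doc_id: term count} over tokenized documents."""
    index = defaultdict(dict)
    for doc_id, tokens in enumerate(docs):
        for term, count in Counter(tokens).items():
            index[term][doc_id] = count
    return index

def entropy(label_counts, total):
    """Shannon entropy (in bits) of a label distribution."""
    return -sum((c / total) * math.log2(c / total)
                for c in label_counts.values() if c)

def information_gain(postings, labels):
    """Information gain of a term's presence/absence w.r.t. the class labels."""
    n = len(labels)
    present = Counter(labels[d] for d in postings)
    absent = Counter(labels) - present
    n_present, n_absent = len(postings), n - len(postings)
    gain = entropy(Counter(labels), n)
    if n_present:
        gain -= n_present / n * entropy(present, n_present)
    if n_absent:
        gain -= n_absent / n * entropy(absent, n_absent)
    return gain

def select_features(index, labels, k):
    """ASSUMPTION: score = IG * log(1 + corpus term count); keep the top k."""
    scores = {term: information_gain(p, labels) * math.log(1 + sum(p.values()))
              for term, p in index.items()}
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def tfidf_vector(tokens, features, index, n_docs):
    """ASSUMPTION: standard smoothed TF-IDF, not the patent's improved formula."""
    counts, total = Counter(tokens), len(tokens) or 1
    return [(counts[t] / total)
            * (math.log((1 + n_docs) / (1 + len(index.get(t, {})))) + 1)
            for t in features]

def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_classify(query, train_vectors, labels, k=3):
    """ASSUMPTION: plain majority-vote KNN, not the patent's optimized variant."""
    ranked = sorted(range(len(train_vectors)),
                    key=lambda i: cosine(query, train_vectors[i]), reverse=True)
    return Counter(labels[i] for i in ranked[:k]).most_common(1)[0][0]

# Toy, already-tokenized "specifications" with hypothetical field labels.
docs = [
    ["battery", "electrode", "lithium", "cell"],
    ["battery", "anode", "lithium", "charge"],
    ["engine", "piston", "fuel", "cylinder"],
    ["engine", "valve", "fuel", "combustion"],
]
labels = ["batteries", "batteries", "engines", "engines"]

index = build_inverted_index(docs)
features = select_features(index, labels, k=4)
train_vectors = [tfidf_vector(d, features, index, len(docs)) for d in docs]

query = tfidf_vector(["lithium", "battery", "pack"], features, index, len(docs))
predicted = knn_classify(query, train_vectors, labels, k=3)
print(features, predicted)
```

On this toy corpus the four words that perfectly separate the two fields are selected as features, and the query is assigned to the field of its nearest training patents. Text preprocessing (tokenization, stop-word removal) is assumed to have been done already.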