patent/US-11532378-B2

http://rdf.ncbi.nlm.nih.gov/pubchem/patent/US-11532378-B2

Outgoing Links

Predicate	Object
assignee	http://rdf.ncbi.nlm.nih.gov/pubchem/patentassignee/MD5_3c35a206188eedb4e5e4f7e22fa5b067
classificationCPCAdditional	http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-047 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-045
classificationCPCInventive	http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06F16-2255 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-0445 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G16B30-10 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-044 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06F16-24534 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-08 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-084 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G16B40-20
classificationIPCInventive	http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06N3-04 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G16B30-10 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06F16-22 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G16B40-20 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G01N33-50 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06F16-2453 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06N3-08 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G01N33-48
filingDate	2021-11-23-04:00^^<http://www.w3.org/2001/XMLSchema#date>
grantDate	2022-12-20-04:00^^<http://www.w3.org/2001/XMLSchema#date>
inventor	http://rdf.ncbi.nlm.nih.gov/pubchem/patentinventor/MD5_c6e0018ff0c74489db2b53b40f749bd6 http://rdf.ncbi.nlm.nih.gov/pubchem/patentinventor/MD5_c57e9db2714fe431c776ac73fa6a1c89
publicationDate	2022-12-20-04:00^^<http://www.w3.org/2001/XMLSchema#date>
publicationNumber	US-11532378-B2
titleOfInvention	Protein database search using learned representations
abstract	A method for efficient search of protein sequence databases for proteins that have sequence, structural, and/or functional homology with respect to information derived from a search query. The method involves transforming the protein sequences into vector representations and searching in a vector space. Given a database of protein sequences and a learned embedding model, the embedding model is applied to each amino acid sequence to transform it into a sequence of vector representations. A query sequence is also transformed into a sequence of vector representations, preferably using the same learned embedding model. Once the query has been embedded in this manner, proteins are retrieved from the database based on distance between the query embedding and the protein embeddings contained within the database. Rapid and accurate search of the vector space is carried out using exact search using metric data structures, or approximate search using locality sensitive hashing.
priorityDate	2020-11-23-04:00^^<http://www.w3.org/2001/XMLSchema#date>
type	http://data.epo.org/linked-data/def/patent/Publication
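
The retrieval flow summarized in the abstract row above (embed the database, embed the query, then retrieve by distance using exact or approximate search) can be sketched roughly as follows. This is only an illustrative sketch, not the patented method. `toy_embed` is a hypothetical stand-in for the learned embedding model, and it pools each sequence into a single amino acid composition vector rather than producing a sequence of vectors. `exact_search` is a brute-force scan used in place of a metric data structure. `HyperplaneLSH` is one common locality sensitive hashing scheme (random hyperplanes), chosen here as an assumption. All names and sequences are invented for the example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_embed(seq):
    """Hypothetical embedding: unit-length amino acid composition vector.

    A real system would apply a learned model here; this is an assumption
    made only to keep the sketch self-contained.
    """
    counts = [0.0] * len(AMINO_ACIDS)
    for residue in seq:
        idx = AMINO_ACIDS.find(residue)
        if idx >= 0:  # silently skip non-standard residue codes
            counts[idx] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_search(query_vec, db_vecs, k=1):
    """Brute-force exact search: names of the k nearest database vectors."""
    ranked = sorted(db_vecs, key=lambda name: euclidean(query_vec, db_vecs[name]))
    return ranked[:k]


class HyperplaneLSH:
    """Approximate search via random-hyperplane locality sensitive hashing."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        # Each hyperplane contributes one sign bit to the bucket key.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets = {}

    def _key(self, vec):
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes
        )

    def add(self, name, vec):
        self.buckets.setdefault(self._key(vec), []).append(name)

    def candidates(self, vec):
        """Names stored in the same bucket as vec (possibly empty)."""
        return list(self.buckets.get(self._key(vec), []))


# Invented example data, not taken from the patent.
database = {"protA": "MKTAYIAKQR", "protB": "GGGGSGGGGS", "protC": "MKTAYIAKQK"}
db_vecs = {name: toy_embed(seq) for name, seq in database.items()}

index = HyperplaneLSH(dim=len(AMINO_ACIDS))
for name, vec in db_vecs.items():
    index.add(name, vec)

query_vec = toy_embed("MKTAYIAKQR")
best = exact_search(query_vec, db_vecs, k=1)
approx = index.candidates(query_vec)
```

In this sketch the exact scan costs time linear in the database size, while the hashed lookup inspects only one bucket and may miss near neighbors that fall on the other side of a hyperplane. Practical LSH setups typically use several hash tables to reduce such misses.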

Incoming Links

Predicate	Subject
isCitedBy	http://rdf.ncbi.nlm.nih.gov/pubchem/patent/CN-111696624-A
isDiscussedBy	http://rdf.ncbi.nlm.nih.gov/pubchem/compound/CID3000322 http://rdf.ncbi.nlm.nih.gov/pubchem/substance/SID226406433

Total number of triples: 32.