| Predicate | Object |
| --- | --- |
| assignee | http://rdf.ncbi.nlm.nih.gov/pubchem/patentassignee/MD5_b4f8a66440d9eac562732a3e4ac67704 |
| classificationCPCAdditional | http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G05B2219-41054 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/F02D41-1405 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G05B2219-25255 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/Y10S128-925 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-00 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-02 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/F03D7-046 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N7-046 |
| classificationCPCInventive | http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G01C21-3602 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/B60W30-00 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-08 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-045 http://rdf.ncbi.nlm.nih.gov/pubchem/patentcpc/G06N3-04 |
| classificationIPCAdditional | http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/F02D41-14 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/F03D7-04 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06N3-00 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06N3-02 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06N7-04 |
| classificationIPCInventive | http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06N3-08 http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/G06N3-04 |
| filingDate | 2018-01-19-04:00^^<http://www.w3.org/2001/XMLSchema#date> |
| grantDate | 2022-11-29-04:00^^<http://www.w3.org/2001/XMLSchema#date> |
| inventor | http://rdf.ncbi.nlm.nih.gov/pubchem/patentinventor/MD5_0b6e20c7e7e152790c488b2a72f96a49 |
| publicationDate | 2022-11-29-04:00^^<http://www.w3.org/2001/XMLSchema#date> |
| publicationNumber | US-11514305-B1 |
| titleOfInvention | Intelligent control with hierarchical stacked neural networks |
| abstract | A neural network method, comprising: modeling an environment; implementing a policy based on the modeled environment, to perform an action by an agent within the environment, having at least one estimated dynamic parameter; receiving an observation and a temporally-associated cost or reward based on operation of the agent in the environment controlled according to the policy; and updating the policy, dependent on the received observation and the temporally-associated cost or reward, to improve the policy to optimize an expected future cumulative cost or reward. The policy may represent a set of parameters defining an artificial neural network having a plurality of hierarchical layers and having at least one layer which receives inputs representing aspects of the received observation indirectly from other neurons, and produce outputs to other neurons which indirectly implement the policy, the plurality of hierarchical layers being trained according to respectfully distinct training criteria. |
| isCitedBy | http://rdf.ncbi.nlm.nih.gov/pubchem/patent/CN-116151890-A http://rdf.ncbi.nlm.nih.gov/pubchem/patent/CN-116151890-B |
| priorityDate | 2010-10-26-04:00^^<http://www.w3.org/2001/XMLSchema#date> |
| type | http://data.epo.org/linked-data/def/patent/Publication |
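The date values above are typed RDF literals in which an XSD timezone offset (`-04:00`) is appended directly to the calendar date, e.g. `2018-01-19-04:00^^<http://www.w3.org/2001/XMLSchema#date>`. A minimal stdlib-only Python sketch (the function name `parse_xsd_date_literal` is our own, not part of any PubChem tooling) for splitting such a literal into its date, timezone, and datatype IRI:

```python
import re
from datetime import date

def parse_xsd_date_literal(literal: str):
    """Split an RDF literal like
    '2018-01-19-04:00^^<http://www.w3.org/2001/XMLSchema#date>'
    into (date, timezone string or None, datatype IRI)."""
    m = re.fullmatch(
        r"(\d{4})-(\d{2})-(\d{2})"      # the calendar date
        r"((?:[+-]\d{2}:\d{2})|Z)?"     # optional XSD timezone suffix
        r"\^\^<([^>]+)>",               # the datatype IRI in angle brackets
        literal,
    )
    if m is None:
        raise ValueError(f"not an xsd:date literal: {literal!r}")
    y, mo, d, tz, datatype = m.groups()
    return date(int(y), int(mo), int(d)), tz, datatype

d, tz, datatype = parse_xsd_date_literal(
    "2018-01-19-04:00^^<http://www.w3.org/2001/XMLSchema#date>"
)
# d -> date(2018, 1, 19); tz -> "-04:00"
```

This keeps the timezone as a string rather than folding it into a `datetime`, since `xsd:date` denotes a whole day and Python's `date` type carries no offset.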