贪心科技 | Personalized education services for everyone

Review: Paper Reading
LSTM: A Search Space Odyssey
Instructor Fan
2020/05/24

Content
• Introduction
• Vanilla LSTM
• History of LSTM
• Evaluation Setup
• Results and Discussion
• Conclusion

Introduction
• LSTMs are both general and effective at capturing long-term temporal dependencies.
• The central idea behind the LSTM architecture is
  - a memory cell: records the state at each time step
  - non-linear gating units: regulate the flow of information into and out of the memory cell
• Eight different variants are evaluated on three benchmark problems: acoustic modeling, handwriting recognition, and polyphonic music modeling.
• Each variant differs from the vanilla LSTM by a single change.

Vanilla LSTM

Vanilla LSTM - Forward Pass

Vanilla LSTM - Backpropagation Through Time

History of LSTM
Original LSTM:
1. Cells, input gates, and output gates
2. Only the gradient of the cell was propagated back through time
A. Forget Gate: lets the LSTM reset its own state
B. Peephole Connections: for learning precise timings
C. Full Gradient: training with full backpropagation through time
D. Other Variants: modifications to the model structure and the training method, most notably the GRU

Evaluation Setup
1. The vanilla LSTM is used as a baseline and evaluated together with eight of its variants.
2. Each variant adds, removes, or modifies the baseline in exactly one aspect.
3. They are evaluated on three different datasets from different domains.
4. The hyperparameters of each variant are tuned separately using random search.

Evaluation Setup
A. Datasets
TIMIT: acoustic modeling dataset, classification
IAM Online: handwriting database
JSB Chorales: a collection of several hundred four-part chorales by Bach
B. Network Architecture & Training
- JSB Chorales: a single LSTM hidden layer + sigmoid output layer
- TIMIT and IAM Online tasks: bidirectional LSTM
- Training with stochastic gradient descent
- Training stops after 150 epochs, or earlier if there is no improvement for 15 epochs

Evaluation Setup - C. LSTM Variants

Evaluation Setup - D. Hyperparameter Search
1. Random search for each combination of 9 variants + 3 data...