Sunday 15 April 2018

Trading Strategies Using Deep Learning


Machine Learning for Trading.
Offered at Georgia Tech as CS 7646.
About this course.
This course introduces students to the real-world challenges of implementing machine learning based trading strategies, including the algorithmic steps from information gathering to market orders. The focus is on how to apply probabilistic machine learning approaches to trading decisions. We consider statistical approaches like linear regression, KNN, and regression trees, and how to apply them to actual stock trading situations.
Course leads.
Tucker Balch.
Arpan Chakraborty.
What you will learn.
This course is composed of three mini-courses:
Mini-course 1: Manipulating Financial Data in Python.
Mini-course 2: Computational Investing.
Mini-course 3: Machine Learning Algorithms for Trading.
Each mini-course consists of about 7 to 10 short lessons. Assignments and projects are interleaved.
Fall 2015 OMS students: there will be two tests - a midterm after mini-course 2 and a final exam.
Prerequisites and requirements.
Students should have strong coding skills and some familiarity with equity markets. No finance or machine learning experience is assumed.
Note that this course serves students focusing on computer science, as well as students in other disciplines such as industrial systems engineering, management, or math, who have different backgrounds. All types of students are welcome!
The ML topics may be "review" for CS students, while the finance parts will be review for finance students. However, even if you have experience in these topics, you will find that we consider them in a different way than you may have seen before, in particular with an eye toward implementation for trading.
Programming will be primarily in Python. We will use numerical libraries such as NumPy and Pandas.
Why take this course.
At the end of this course, you should be able to:
Understand the data structures used for algorithmic trading.
Know how to construct software to access live equity data, assess it, and make trading decisions.
Understand 3 popular machine learning algorithms and how to apply them to trading problems.
Understand how to assess a machine learning algorithm's performance for time-series data (stock price data).
Know how and why data mining (machine learning) techniques fail.
Build a working trading software system that uses current daily data.
Some limitations/constraints:
We use daily data. This is not an HFT course, but many of the concepts here are relevant. We do not interact (trade) directly with the market, but we will generate equity allocations that you could trade if you wished.

Trading Strategies Using Deep Learning
Thomas Wiecki mentioned this a few years ago (he omitted the spaces, so search for "ApplyingdeepLearningtoEnhanceMomentumTradingStrategiesinStocks" in the trading ideas thread).
Takeuchi, L., Lee, Y. (2013). Applying Deep Learning to Enhance Momentum Trading Strategies in Stocks.
"We use an autoencoder composed of stacked restricted Boltzmann machines to extract features from the history of individual stock prices. Our model is able to discover an enhanced version of the momentum effect in stocks without extensive hand engineering of input features and deliver an annualized return of 45.93% over the 1990-2009 test period versus 10.53% for basic momentum."
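For anyone wondering what "stacked restricted Boltzmann machines" looks like in practice, here is a minimal sketch using scikit-learn's BernoulliRBM. This is only a stand-in for the paper's encoder (sklearn trains the encoder weights unsupervised but has no decoder half), and the data here is random toy input, not stock returns:

```python
# Sketch: stack two RBMs as an unsupervised feature extractor,
# roughly mirroring the paper's 33 -> 40 -> 4 encoding path.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
# Toy stand-in for the paper's inputs: 33 normalized return features
# per (stock, month) sample, squashed into [0, 1] for the Bernoulli units.
X = rng.random((500, 33))

stacked = Pipeline([
    ("rbm1", BernoulliRBM(n_components=40, learning_rate=0.05,
                          n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=4, learning_rate=0.05,
                          n_iter=20, random_state=0)),
])
codes = stacked.fit_transform(X)   # learned 4-dimensional features
print(codes.shape)                 # (500, 4)
```

The transformed outputs are the hidden-unit activations, which would then feed a classifier layer in the paper's setup.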
Could someone with a head for data science create a Q version of this?
Totally fascinating. It's impossible to comment at this stage, but there are a huge number of questions I will need to seek answers to. For me, the most interesting sentence is this one:
"our model is not merely rediscovering known patterns in stock prices, but going beyond what humans have been able to achieve."
Is it REALLY possible that deep learning can take a simple set of returns and improve on the predictions made by applying a simple momentum strategy? This paper seems to indicate that is the case.
Having read the paper two or three times, I'm still not clear what each "stack" actually does, but no doubt I will eventually stumble toward some sort of conclusion.
Happily, this paper comes along at a time when I have decided to retire from the unbelievably boring research I have done to date. I have decided to "learn" AI and deep learning. Or at least to try.
I'm far from certain it has any application to long-term forecasting of stock prices, but this paper seems to suggest otherwise. I look forward to finding out whether this research has really discovered El Dorado, or whether other factors are at play that will make this line of research as fruitless as most others in the financial markets.
Training a deep neural network on Quantopian data would be challenging unless you could run the notebooks/algorithms on machines with powerful GPUs attached.
If you have offline access to the relevant trading data, you could train a network on non-Quantopian machines and then translate the resulting network for execution in the Quantopian framework.
Very interesting to read some of the other Stanford papers on deep learning applied to markets. The referenced paper claims only about 50% classification accuracy on whether trades will end up winners or losers over the following month, using price alone as input.
"The model is correct 53.84% of the time when it predicts class 1 and a slightly lower 53.01% of the time when it predicts class 2."
Consider that a typical unadorned old-fashioned trend-following system typically delivers 40% winning trades and profits by running winners and cutting losers.
If it worked in 2013, would it still work now? I would have thought banks and brokerage houses would have armies of PhDs writing code like this.
Many people think so. And I know what you mean. But if that's true, then you might as well give up entirely. As might Quantopian. I have no idea whether it still works, but I intend to replicate the study. The only thing I'm sure of is my own ignorance.
There was a thread a while back where someone tried this using one of the machine learning libraries on a single stock:
Predicting price movements through regimes and machine learning.
Might be a good place to start.
It runs very slowly. To speed things up, you may want to download price data from EOData (or another site) and work from it on your own machine.
Anthony, I came across this Python machine learning code (and associated MOOC course) and thought you might find it useful: johnwittenauer / machine-learning-exercises-in-python-part-1 /
Another group posted even better accuracy numbers (82% vs 53%). I'm still not sure about the quality.
You could probably just contact the authors about their implementation.
RBMs can be done in R with deepnet.
Interesting. The methodology in the Springer link also relies on price alone as input, though perhaps the higher accuracy isn't surprising: it predicts 1 minute ahead, while Lee's project predicts a month ahead.
I'm focusing on Python, Keras, and Theano, as well as sklearn.
Is the paper freely available anywhere?
Anthony - Yes, a few different implementations. Python can make calls to R if needed. Have you tried PDNN for Python?
My current knowledge is infantile. I'm starting from scratch on the whole topic and building ANNs from scratch for the experience, using a few noddy textbooks. I'm interested in the whole field, so I'm looking at any ML techniques that might be useful, including ensemble methods.
My suspicion is that, as far as long-term investing is concerned, all of this will be a waste of time. Or rather, that it will not provide me with better risk-adjusted returns than the simple 50/50 system I described on my website.
But let's see. I'm as eager to shoot the lights out as anyone else, but I know from experience that these ventures usually turn out rather differently than one might have hoped!
When I'm a bit further along, I'll contact Takeuchi Lee and see what (if anything) he did with the strategy outlined. I wonder whether he actually traded it? Either for himself or for his employers.
Patrick: Thanks.
Gosh, I just noticed this in the referenced paper:
"The data used for training and testing are the tick-by-tick transactions of AAPL from September to November 2008."
1 stock tested over 3 months! I'm surprised they didn't take it a bit further than that, but who knows, perhaps the result would have been the same for different stocks and periods?
Hi Anthony and group. Two problems:
How many trials were involved in obtaining this outperformance? That's not clear. Did they adjust the RBM parameters until they got the desired result? Besides the look-ahead bias, which they claim is not a problem, there is also data snooping and selection bias. In fact, the selection bias could be quite large.
The paper was published in late 2013, but the test sample ended in 2009. There is no reason for that, except in the case where the outperformance came from short sales during the 2000 and 2008 bear markets, in which case it disappeared after 2009.
The claims of momentum outperformance by Glabadanidis were recently debunked by Prof. Zakamulin after he showed there was look-ahead bias in the calculations. More on this and other issues, also in relation to special market conditions that give rise to high t-stats, in my recent paper: papers. ssrn / sol3 / papers. cfm? Abstract_id = 2810170.
Has anyone examined the technique proposed by Lee et al.? I'm having a go at it (using free Quandl data), but I'm finding it hard going. I can handle the ML aspects, but I'm not sure how they are packaging the data.
I think it's something like this:
For a given moment in time for a given stock, we can construct a (labeled) training item using the previous 13 months (and the subsequent 1 month's value) of daily data for that stock.
We use this data to construct 12 monthly cumulative returns ending one month short of the moment in question. So I guess just accumulate the daily Adj_Close prices and spit out the value every 30 or so passes. Now it gets interesting. They do the same for every other stock at this moment and obtain a z-score for our stock over this set (i.e., the number of standard deviations from the mean). So the movement of that z-score shows the growth of this particular stock relative to the whole market. Since the algorithm is going to keep a certain amount of money invested in the market, and just shift it between stocks, that's what you want!
It seems they do this for each of the 12 monthly cumulative returns.
And then they do the same process over the last 30 days.
This actually makes a lot of sense, because you want to feed data with mean 0, roughly over the range (-1, +1), into your NN.
So that covers the input data. (There is one extra input, which is a start-of-year indicator.) But a complete supervised training item also requires an associated output value. It seems they are just using whether this particular stock went up over the following month, though I don't understand their wording; they talk about "above the median". Is the price one month later higher or lower than the price at this particular moment, and output 1 or 0 accordingly? I think that's what I'll do, since I don't understand what they're saying.
Then I can only assume everything is moved forward by a single day and the algorithm is repeated to generate another sample.
It seems strange to me that they make no use of daily volume.
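The packaging steps described above can be sketched in pandas. This follows one reading of the paper (12 monthly cumulative returns, z-scored across the cross-section, with an above-median label for the next month); the variable names and the toy data are illustrative, not the paper's:

```python
# Sketch: build one month's training items for a small universe.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
tickers = ["AAA", "BBB", "CCC", "DDD"]
# monthly_returns: rows = consecutive month-ends, columns = stocks
monthly_returns = pd.DataFrame(rng.normal(0.01, 0.05, (15, 4)),
                               columns=tickers)

t = 13                                    # "now" (the formation month)
window = monthly_returns.iloc[t - 12:t]   # the 12 months ending 1 month ago
cumrets = (1 + window).cumprod() - 1      # 12 cumulative returns per stock

# Cross-sectional z-score: for each horizon, standardize each stock
# against all stocks at that horizon.
z = cumrets.sub(cumrets.mean(axis=1), axis=0).div(cumrets.std(axis=1), axis=0)

# Label: did the stock beat the cross-sectional median next month?
next_month = monthly_returns.iloc[t]
label = (next_month > next_month.median()).astype(int)

print(z.shape)      # (12, 4): 12 features for each of 4 stocks
print(label.sum())  # 2: half the universe is above the median
```

By construction, half the stocks get label 1 each month, which is what makes the classes balanced in this formulation.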
Actually, I had a go at implementing it on my local machine in TensorFlow, using Yahoo data I downloaded. "Above the median" just means "above the median of percentage returns across all stocks for that month". Just checking whether the price was higher or lower in absolute terms for the month (rather than whether it was higher or lower relative to all the other stocks' moves) would probably be less effective. They are consistent in using this relative approach, since all the return data features are z-scored across stocks for each monthly time step.
I backtested it in zipline and so far have not been able to duplicate their stellar results, but I'm still hopeful, since my code currently doesn't use an RBM-based autoencoder (I'll re-code this part when I have time), and I'm also not training the autoencoder or the full network for many cycles on my single-GPU machine. I also think I could add historical data for now-delisted stocks (instead of just the currently tradable ones I'm using as my "universe"), which would give better feature-extraction results in the autoencoding phase. It wasn't clear to me whether they did this or not. Of course, that old historical data would have to come from another source (not free Yahoo data). They train on data from 1965 to 1989, which is not a lot of data for a deep neural network (and probably too old for the resulting model to have any practical value for trading in the present).
By the way, these guys seemed to be able to reproduce the white paper's results with the same input features and a slightly different machine learning model: math. kth. se/matstat/seminarier/reports/M-exjobb15/150612a. pdf.
So roughly 53% correct prediction in testing? As in, 53% of the time the network correctly predicted stocks that ended up in the top half of returns over the following period? Very similar.
No backtest provided, though. As I say, much better than many long-term TF systems.
Yes, supposedly 52.89% in the paper I referenced, though I'm not getting those results in my own code (yet). Yes, it's too bad there's no backtest data provided. This algorithm is definitely long-term, low frequency (you run it once a month and hold your positions for the whole month), though it could certainly be modified to a shorter time frame. I intend to play around with it using minute data too, eventually, and with different trading frequencies on the monthly/daily data.
The Takeuchi paper also doesn't mention vol and drawdown. Probably very high, I imagine. Also all sorts of other problems, like a bias depending on the reallocation date, and God knows what else. But interesting stuff. Personally, a monthly holding period wouldn't worry me if the return were really good. But to be honest, after years of fooling myself, I'm rather jaundiced about the backtesting of any system whatsoever.
In my humble experience, annualized returns above 15% are due to look-ahead bias or overfitting. The market does not allow such high returns, because a leveraged investor would own it over the longer term. So these academic researchers are being fooled by backtesting, with their most serious caveat being the assumption of "no impact" on prices.
If you take a look at my August post above, there is a mention of the papers by Glabadanidis on price-series momentum that were declared seminal, with returns on the order of 15%, only to be refuted recently by Zakamulin as the result of look-ahead bias. We're talking about simple algos here, yet code implemented in Excel had a look-ahead bias. Imagine what can go wrong with complicated ML algos in that domain. I use the geometric equity curve test. If it is valid, the probability of a failed backtest is > 95%.
Michael Harris, you may be right, and thanks for trying to save me from myself and from impractical academics, but I've decided I'd be happy just reproducing those returns, even if flawed. At that point, if they seemed too good to be true, I'd try to pick them apart to find the bias/snooping/overfitting. The main point for me was really an exercise in learning TensorFlow and applying deep learning techniques to financial time-series data. I believe there are patterns to be extracted from this kind of data using the deep learning approach, although perhaps the momentum-based model this particular ML algorithm produces won't be profitable. The great thing about deep neural networks is that once you have the basic data flow down and the network structure declared, it's easy to feed in different data you think might be predictive and produce a model with completely different behavior. It's also relatively easy to modify the network structure, and very easy to tweak the parameters to see whether they produce better test results, though as you mentioned, if done improperly, I understand there is a risk of overfitting. I still have a lot to learn about the pitfalls, so thanks for the words of caution.
Justin Weeks, perhaps you misunderstood; I wasn't commenting on your work and efforts, but on academic papers whose results cannot be replicated and which even contain serious errors and assumptions that demonstrate a lack of understanding of markets and trading.
If you pay attention to that paper's results, the following problems are present:
Repeated trials until the authors got a good result. This introduces data-snooping bias. They do not adjust their t-stat for it, which shows a lack of understanding of the dangers of data mining.
The biggest gains are between 1990 and 2001, probably a long overfit during the strongest uptrend in stock market history and a short overfit during the dot-com crash.
The authors do not report important metrics such as maximum drawdown, Sharpe ratio, and payoff ratio.
Unfortunately, academia knows how to fool company executives with promises of high returns, and the authors of similar papers get high-paying positions; before they are fired, they accumulate good wealth at the expense of honest analysts who never report unrealistic annual return numbers and who apply a reality check to reduce data-mining bias. These honest people have no impressive results to show, only reality, and they will never get past the door of a big investment bank or hedge fund.
The whole paper was a demonstration of how one can use ML to fit the data and generate unrealistic returns while obscuring the facts.
Oh God, you really should listen to Michael. He is so right. I'm sitting here writing another book - my foolish publishers came back for more. I wanted to have a whole first section on what NOT to do and wrote a few chapters on the folly of relying on backtests in probabilistic trading.
The publishers asked me not to: apparently readers only want to hear about what works.
I'm actually convinced ML is a suitable tool for trend following, but I have absolutely no doubt that a 45% annual return is a fool's errand. Unlike Michael, I believe in trends (at least in stocks), although even there I have been fooled and deceived in the past by overfitting.
After 30 years in the markets, 15 of those spent largely in systematic trading of one kind or another, I feel deeply cynical. The hedge fund world generally makes money for the fund managers, who walk away with enormous fees after their funds blow up. They then start another.
It seems we have two sides of the argument: machine learning experts who know little about real-world trading, and real-world traders who have no machine learning experience.
I'm addicted to ML. If I can develop profitable trading algorithms, great! If not, never mind, there are plenty of other decent options.
I don't see any suggestion of intellectual dishonesty in that Lee paper; however, I agree it's annoying that journals are allowed to publish results without supporting code.
If anyone is interested in chatting about ML, come to ##machinelearning on Freenode IRC.
Justin - thanks for that reply, and for the link!
PS I looked through the paper Patrick linked (link. springer / chapter / 10.1007 / 978-3-319-42297-8_40); it seems very sketchy. The original paper, however, seems robust as far as I can see. I'll keep trying to replicate it.
"machine learning experts who know little about real-world trading, and real-world traders who have no machine learning experience."
That alludes to a false dichotomy. Real-world trading can be carried out through a variety of methods, including ML. Lack of experience with ML may not be a disadvantage in many cases, since it can spare you many exercises in futility.
"I don't see any suggestion of intellectual dishonesty in that Lee paper"
One would expect university researchers to be familiar with data mining and data-snooping bias. The paper was about p-hacking with ML. That is disturbing for an academic work. The exact number of trials taken to reach the final result should have been reported. But that doesn't amount to intellectual dishonesty, rather to naive application of ML.
Good point about the code, but I suspect that even if you had the exact code, you still wouldn't be able to replicate the results due to stochasticity.
"you still wouldn't be able to replicate the results due to stochasticity."
You should be able to get close. In deep learning, it seems random numbers are generally used only for generating the initial weights. Though I'm one of those with plenty of market experience and little in ML!
These are the error rates of five successive runs of a multilayer perceptron with exactly the same parameters on the same data, from a project I'm working on for a client.
Yes, but I wonder how those differences translate into CAGR in a trading system? I wonder whether it makes much difference that some runs predict, say, 51% of the stocks correctly each month versus 52.4%? Knowing the vagaries of backtesting, I suspect not?
ML is just a nonlinear equation with tens to thousands of coefficients fitted to the data.
It seems it would be impossible to avoid overfitting. In an up- or down-trending market, I suspect the ML algorithms would just learn momentum rules.
If ML is going to work, I think you would need to apply it across many stocks at once, and throw in fundamental data, economic factors, etc.
Then maybe it could discover a pattern in a data set far too large for a human to look at.
The human brain is very good at recognizing patterns. If there were a pattern in the price history of a single stock, I think you would see it.
Just noting that a fundamental part of the Takeuchi/Lee paper is "stationarizing" the data by transforming it into a cross-sectional format.
"We calculate a series of 12 cumulative returns using the monthly returns and 20 cumulative returns using the daily returns. We note that price momentum is a cross-sectional phenomenon, with winners having high past returns and losers having low past returns relative to other stocks. Thus we normalize each of the cumulative returns by calculating the z-score relative to the cross-section of all stocks for each month."
If the statistics are not stationary, the model doesn't converge, or if it does manage to converge (mathematically), it won't be very useful.
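A tiny numeric illustration of why the quoted normalization stationarizes the inputs: raw cumulative returns drift over time in a trending market, but their cross-sectional z-scores have mean 0 and standard deviation 1 at every time step by construction (simulated data, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(2)
n_months, n_stocks = 60, 100
# Simulated cumulative returns with a shared upward drift (a bull market).
cumrets = np.cumsum(0.01 + rng.normal(0, 0.04, (n_months, n_stocks)), axis=0)

# Cross-sectional z-score: standardize each month's row across all stocks.
mu = cumrets.mean(axis=1, keepdims=True)
sd = cumrets.std(axis=1, keepdims=True)
z = (cumrets - mu) / sd

print(round(float(cumrets.mean(axis=1)[-1]), 2))  # drifted well above 0
print(np.allclose(z.mean(axis=1), 0))             # True: mean 0 every month
print(np.allclose(z.std(axis=1), 1))              # True: std 1 every month
```

The drift (nonstationary mean) is entirely absorbed by the per-month standardization, leaving the network only relative winner/loser information.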
David, I think they kept trying things until they got an impressive result. That is the definition of data-mining bias, mostly driven by data snooping. Nowhere in their paper is there a reference to data-mining bias.
Michael, data snooping is definitely a possibility. However, the setup seems quite plausible for fairly good results... maybe not 50% returns, but perhaps up to 20% in a "normal" year. I know you spoke of 15%, but I'm optimistic, perhaps naively.
Normal year: the paper didn't address more interesting things, like (macro) regime change, which could affect the test results. For example, momentum behavior could be wildly different in the last quarter of 2008 compared to the second through fourth quarters of 2009. Whether your test covers or misses 2008 could change the results.
A likely place for data snooping to show up in the setup (unless the authors kept trying different configurations) is the cross-validation portion. In my experience, this is where "leakage" can inadvertently be introduced into the system. By leakage I mean a leak of future data. The authors never provided the details of the holdout / cross-validation split, but if they weren't careful about how they created the test set or the folds for cross-validation, they probably made the same mistakes when training the final product.
Here is a Kaggle page about leakage:
From the CEO of another platform:
[Many of these algorithms were developed by students using sophisticated machine learning methods such as neural networks. "I am impressed by the quality and stability of the trading algorithms."]
Deep learning seems to be very important for staying competitive.
"If you have offline access to the relevant trading data, you could train a network on non-Quantopian machines and then translate the resulting network to scipy for execution in the Quantopian framework."
So I could run it in the Quantopian framework, but couldn't enter the contest? I have the relevant data. I'm looking for ways to build up some paper-trading track record. I could use Interactive Brokers' paper trading, but it's expensive to have many IB accounts.
Greg - thanks for the info.
"It runs very slowly. To speed things up, you may want to download price data from EOData (or another site) and work with it on your own machine."
After working with external data on one's own machine, is there any shortcut for changing the code to upload it back to Quantopian?
"[Many of these algorithms were developed by students using sophisticated machine learning methods such as neural networks. 'I am impressed by the quality and stability of the trading algorithms.']"
But the assumption is that he doesn't know the algorithms. Or am I missing something?
Maybe the next market regime change will sort things out.
[deleted - see Antony's post below]
"Next market regime change", meaning when some platforms can't survive? Very interesting anyway.
I mean, when market dynamics change, all the over-fitted ML systems will fail.
More on the significance problem can be found in my paper: papers. ssrn / sol3 / papers. cfm? Abstract_id = 2810170.
For now, the impact of these competitions is small. Market regime changes are driven by structural changes (algo trading in the late 1990s, decimalization, then HFT, etc.). In my opinion the ensemble results are random: priceactionlab / Blog / 2016/09 / data-science /
There is no way to distinguish a low log loss due to multiple trials from real statistical significance. These competitions are doomed in my opinion, since more participants means further convergence of the sample mean to a true mean of 0. In addition, they carry a short-term risk of ruin that is uncontrollable, though small. The key to profits is identifying one or two features that are robust for the current regime and using those in a simple algo. Everything else translates into more bias, more noise, more risk.
I think this thread has drifted off topic. If that's the case, could the people responsible create new threads and migrate accordingly? I'd like to stay subscribed to this thread but only receive email notifications related to the original subject.
I can implement that; working in the Indian market, my interest is more in minute or five-minute data. There could also be much better use of deepnet if you combine it with self-learned patterns.
Does anyone have experience with how to put the trained network into production? To be more specific, how to save the trained model and use it in the live trading environment. Thanks.
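One common pattern (an assumption on my part, not a platform-specific answer): serialize the fitted model to disk in the research environment, then load it in the live process and call predict on each new bar. Sketched here with scikit-learn and joblib; Keras models have an analogous model.save / load_model pair. The model and filename are placeholders:

```python
import numpy as np
import joblib
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X, y = rng.normal(size=(200, 4)), rng.integers(0, 2, 200)

# --- research/training environment ---
model = LogisticRegression().fit(X, y)
joblib.dump(model, "momentum_classifier.joblib")

# --- later, in the live trading process ---
live_model = joblib.load("momentum_classifier.joblib")
signal = live_model.predict(rng.normal(size=(1, 4)))  # one new observation
print(signal.shape)  # (1,)
```

The loaded model reproduces the trained one exactly, so the live signals match what the backtest would have produced for the same inputs.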
I'm just doing that with my own machine learning algos on VIX futures contracts. I'll report back when I'm done. But I won't use it on Q or sketch it on Q, since I use daily prices, futures contracts, and a different Python backtesting engine.
I had been meaning to do a backtest for a long time. I finally took a stab at it. Here are my results (and settings):
Total number of tickers: 2,585.
Exchanges: NYSE and NASDAQ.
Date range: 2012-02-21 to 2016-11-29.
Trading days: 1,203.
Training data: from the start through 2015-12-31.
Test data: from 2016-01-01 to the end.
Neural network (encoder-decoder)
• Architecture
  o #nodes in each layer: (33 i/p)-40-4-40-(33 o/p)
  o Activation function for hidden layers: ReLU
  o Activation function for output: linear
• Optimization
  o batch_size = 100,000
  o Optimizer: Adam (learning rate: 0.001)
  o Loss function: MSE
• Performance (on training set)
  o Loss after 100 epochs = 0.1505
Rede Neural (Classificador)
• Arquitetura o (#nodes em cada camada oculta): (4 i / p) - & gt; 20- & gt; (1 o / p)
o Função de ativação para camadas ocultas: ReLu.
o Função de ativação para saída: Sigmoid.
• Otimização o batch_size = 100.000.
o Otimizador: Adam (taxa de aprendizado: 0,01)
o Função de perda: binary_crossentropy.
o Regularização: abandono de 40% na camada oculta.
• Desempenho (no conjunto de treinamento) o Perda após 100 épocas = 0,6926.
o Precisão (taxa de classificação): 0.5141.
• Desempenho (no conjunto de teste) o Precisão (taxa de classificação): 0.4844.
• Retorno (decil superior longo e decil inferior curto) = -1,66% (anualizado).
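As an illustration of the decile portfolio construction above (random placeholder data, not the poster's): go long the top decile of model scores, short the bottom decile, equal-weighted.

```python
import numpy as np

# Random placeholders for one rebalancing period.
rng = np.random.default_rng(0)
n = 1000
scores = rng.random(n)              # predicted probability of outperformance
returns = rng.normal(0.0, 0.02, n)  # realized per-stock returns for the period

# Decile cutoffs: long everything at or above the 90th percentile,
# short everything at or below the 10th.
lo, hi = np.quantile(scores, [0.1, 0.9])
long_leg = returns[scores >= hi].mean()
short_leg = returns[scores <= lo].mean()
portfolio_return = long_leg - short_leg
print(portfolio_return)
```

With a useless model (as here, where scores are independent of returns), the long-short return hovers around zero, which is one way to sanity-check the pipeline.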
I used Quandl data (the EOD dataset) to build the 13 features, as suggested in the paper.
I tried different learning rates and regularization approaches, but the results don't differ drastically. Interestingly, a naive approach of going long (all stocks) over the same period produces a +19.34% return. That's not surprising, since the test period is 2016 and the market rose at an equivalent rate.
Looking forward to your thoughts.
I like your blogs, but I think you are missing something about ML algos here. They can be adaptive if you use a rolling window with weights for retraining. That is the same process by which we humans relearn a new environment. A DNN may need more data, but other ML algos can still be useful. The method in the paper may have "overfit" the strategy in picking the network architecture, but as they are not directly optimizing on the final PnL, I think the "overfitting" problem would be less severe than in normal trading-system optimization on the final PnL/Sharpe/Sortino.
I have carried out similar experiments on US stocks and I think your training size is a little bit too small. Nevertheless, the system has not been doing very well since 2016 in my setups, even though I used cross-validation to tune the NN/ML structure. The best stretch in my test period (2000-2017) was right after the tech bubble, which corroborates figure 4 in the Stanford paper. Post-2000, my monthly return is much lower (~20% CAGR, 1.6 Sharpe, 16% MaxDD) than the numbers reported in the paper, partly because of using only post-2000 data in the test sample.
Adding more data may not help, since the training data currently has close to 3 million observations.
I see your point, but I think the original paper was forecasting monthly returns instead of daily returns, so you would only have 2,500*12*5 = 150K data points. With half for training, you "only" have ~75K data points for a deep NN, which might be too small?
I guess your usage of forecasting daily returns versus monthly returns might explain why your test resulted in a negative CAGR while mine is still positive albeit much smaller than in the paper.
I, too, forecast monthly returns, but I do not constrain feature construction to just the 1st day of every month. I construct the features for every day. This way I have 2,515*1,203 ≈ 3M obs. When computing PnL, however, I choose a particular day of the month to open/close a position.
I acknowledge that this way consecutive days will not have much variation in input features/outcome.
Nonetheless, I'll try training on more isolated dates (one each month) as you suggested.
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian.
In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action, as neither Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed on the website. The views are subject to change and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Nanalyze.
In previous articles, we’ve defined some of the terms being thrown around lately like “machine learning” and “artificial intelligence“. These disruptive technologies will soon change the world as we know it. While some pundits predicted that we were years away from a computer that could beat a human expert at “Go”, this achievement was recently announced. If a “deep learning” program can now beat a game that has more possible moves than atoms in the known universe, then what’s stopping us from unleashing it upon the stock market and making millions?
The idea of using computers to trade stocks is hardly new. Algorithmic trading (also known as algo trading, or black box trading, which is a subset of algo trading) has been around for well over a decade and has been rapidly gaining in popularity. Here’s a look at algorithmic trading as a percentage of market volume:
Source: Morton Glantz, Robert Kissell. Multi-Asset Risk Modeling: Techniques for a Global Economy in an Electronic and Algorithmic Trading Era.
If that trend continues, then this means that today upwards of 90% of trading is being conducted by computer programs. One thing to notice about algorithmic trading is that it has been moving in the direction of shorter and shorter holding times. High frequency trading (HFT) is a subset of algorithmic trading in which stocks are bought and then sold in fractions of a second. This strategy is a form of arbitrage in which the HFT algorithm spots a price discrepancy and then quickly capitalizes on it. As you would expect, HFT trading profits are becoming smaller and smaller, but the volume of trades is still dominating the overall market:
Now that we know about algorithmic trading and HFT, just how does machine learning or deep learning come into play? To answer this question, the important variable to take into account is duration. While HFT and algo trading perform trades of a short duration, it becomes much more difficult to “capture alpha” when you start increasing the time frame. The reality is that some of the world’s biggest hedge funds are already all over this space and have been capturing alpha across many durations for a long time now using machine learning.
Early last year, Bridgewater Associates which has $150 billion in assets under management (AUM) started a new artificial intelligence unit led by David Ferrucci who led the development of IBM’s Watson. After working at IBM for 17 years, he was poached by Bridgewater in 2012.
Another firm called Renaissance Technologies has $65 billion in AUM and is said to have “the best physics and mathematics department in the world”. The Medallion Fund at Renaissance, run mostly for employees of the company, has one of the best records in investing history having returned +35% annualized over 20 years. The two co-CEOs of Renaissance were both hired from IBM Research in 1993 where they were working on language-recognition programs.
With $32 billion under management, Two Sigma Investments is known for using AI and machine learning as a key part of their strategy. One co-founder did his PhD in artificial intelligence at MIT and the other was an International Mathematical Olympiad Silver Medalist. Being a finance professional is not a requirement to work at this firm.
While hedge funds such as these 3 are pioneers of using machine learning for stock trading strategies, there are some startups playing in this space as well. Binatix is a deep learning trading firm that came out of stealth mode in 2014 and claims to be nicely profitable having used their strategy for well over three years. Aidyia is a Hong Kong based hedge fund launched in 2015 that trades in U.S. equities and makes all stock trades using artificial intelligence with no human intervention required. Sentient, another deep learning company we discussed before, has developed an artificial intelligence trader that was successful enough that they are considering spinning it out as a prop trading company or asset management firm.
If there’s a startup that shows promise in this space, you can bet that the 3 well established hedge funds we discussed know about it. If you had a machine learning algorithm that generated alpha, would you tell the world about it? Most likely not. But then how would you raise the capital needed to make some serious money off of your strategy? Firms like Bridgewater can be as nimble as any startup and at the same time have $150 billion in capital to play with. It’s hard to compete if you’re a startup that’s trying to get funded. If you’re looking for investors, you have to disclose what you’re doing. Word travels fast. It’s not hard to see hedge funds like Bridgewater poaching talent from AI startups that are trying to play in this space and quickly finding out what they’re up to.
For retail investors to take advantage of machine learning for stock trading, you have a couple directions to take. For ultra high net worth retail investors, you can invest your money in one of the hedge funds using AI like Bridgewater or Renaissance. For those of us who don’t have such large amounts of capital, we can wait for deep learning companies like Sentient to go public or be acquired and then invest in those vehicles. We’ll be keeping a close eye on this space because frankly, it’s just fascinating.
Published: April 14, 2016.
You said: Algorithmic trading (also known as algo trading or black box trading)
Just wanted to point out that not all algo trading is black box.
Thank you for the clarification David! We noted that in the article.
there is an ETF that allows investors to access these technologies today! NYSE listed ticker symbol ‘BUZ’. Learn more at buzzindexes.
Thank you for the comment Jamie! That was a great interview you had on Squawk Box introducing the BUZ ETF.
Thanks for the heads up!


Better Strategies 5: A Short-Term Machine Learning System.
It’s time for the 5th and final part of the Build Better Strategies series. In part 3 we’ve discussed the development process of a model-based system, and consequently we’ll conclude the series with developing a data-mining system. The principles of data mining and machine learning have been the topic of part 4. For our short-term trading example we’ll use a deep learning algorithm, a stacked autoencoder, but it will work in the same way with many other machine learning algorithms. With today’s software tools, only about 20 lines of code are needed for a machine learning strategy. I’ll try to explain all steps in detail.
Our example will be a research project – a machine learning experiment for answering two questions. Does a more complex algorithm – such as, more neurons and deeper learning – produce a better prediction? And are short-term price moves predictable by short-term price history? The last question came up due to my scepticism about price action trading in the previous part of this series. I got several emails asking about the “trading system generators” or similar price action tools that are praised on some websites. There is no hard evidence that such tools ever produced any profit (except for their vendors) – but does this mean that they all are garbage? We’ll see.
Our experiment is simple: We collect information from the last candles of a price curve, feed it in a deep learning neural net, and use it to predict the next candles. My hypothesis is that a few candles don’t contain any useful predictive information. Of course, a nonpredictive outcome of the experiment won’t mean that I’m right, since I could have used wrong parameters or prepared the data badly. But a predictive outcome would be a hint that I’m wrong and price action trading can indeed be profitable.
Machine learning strategy development.
Step 1: The target variable.
To recap the previous part: a supervised learning algorithm is trained with a set of features in order to predict a target variable. So the first thing to determine is what this target variable shall be. A popular target, used in most papers, is the sign of the price return at the next bar. Better suited for prediction, since less susceptible to randomness, is the price difference to a more distant prediction horizon, like 3 bars from now, or same day next week. Like almost anything in trading systems, the prediction horizon is a compromise between the effects of randomness (fewer bars are worse) and predictability (fewer bars are better).
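As a toy illustration of the two candidate targets just mentioned, on a made-up price series:

```python
import numpy as np

# Two candidate target variables: the sign of the next bar's return,
# and the price difference to a 3-bar prediction horizon.
prices = np.array([100.0, 101.0, 100.5, 102.0, 103.0, 102.5, 104.0])
horizon = 3

next_bar_sign = np.sign(np.diff(prices))             # target at bar t: sign(p[t+1] - p[t])
horizon_diff = prices[horizon:] - prices[:-horizon]  # target at bar t: p[t+3] - p[t]

print(next_bar_sign)  # values: 1, -1, 1, 1, -1, 1
print(horizon_diff)   # values: 2, 2, 2, 2
```

Note that the sign target flips on every small wiggle, while the 3-bar difference smooths the noise out, which is exactly the trade-off described above.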
Sometimes you’re not interested in directly predicting price, but in predicting some other parameter – such as the current leg of a Zigzag indicator – that could otherwise only be determined in hindsight. Or you want to know if a certain market inefficiency will be present in the next time, especially when you’re using machine learning not directly for trading, but for filtering trades in a model-based system. Or you want to predict something entirely different, for instance the probability of a market crash tomorrow. All this is often easier to predict than the popular tomorrow’s return.
In our price action experiment we’ll use the return of a short-term price action trade as target variable. Once the target is determined, next step is selecting the features.
Step 2: The features.
A price curve is the worst case for any machine learning algorithm. Not only does it carry little signal and mostly noise , it is also nonstationary and the signal/noise ratio changes all the time. The exact ratio of signal and noise depends on what is meant with “signal”, but it is normally too low for any known machine learning algorithm to produce anything useful. So we must derive features from the price curve that contain more signal and less noise. Signal, in that context, is any information that can be used to predict the target, whatever it is. All the rest is noise.
Thus, selecting the features is critical for success – much more critical than deciding which machine learning algorithm you’re going to use. There are two approaches for selecting features. The first and most common is extracting as much information from the price curve as possible. Since you do not know where the information is hidden, you just generate a wild collection of indicators with a wide range of parameters, and hope that at least a few of them will contain the information that the algorithm needs. This is the approach that you normally find in the literature. The problem of this method: Any machine learning algorithm is easily confused by nonpredictive predictors. So it won’t do to just throw 150 indicators at it. You need some preselection algorithm that determines which of them carry useful information and which can be omitted. Without reducing the features this way to maybe eight or ten, even the deepest learning algorithm won’t produce anything useful.
The other approach, normally for experiments and research, is using only limited information from the price curve. This is the case here: Since we want to examine price action trading, we only use the last few prices as inputs, and must discard all the rest of the curve. This has the advantage that we don’t need any preselection algorithm since the number of features is limited anyway. Here are the two simple predictor functions that we use in our experiment (in C):
The two functions are supposed to carry the necessary information for price action: per-bar movement and volatility. The change function is the difference of the current price to the price of n bars before, divided by the current price. The range function is the total high-low distance of the last n candles, also divided by the current price. And the scale function centers and compresses the values to the +/-100 range; we then divide them by 100 to normalize them to +/-1. We remember that normalizing is needed for machine learning algorithms.
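The original C source is not reproduced above; based on the description, a rough Python sketch of the two predictors (function names, argument layout, and the simple clipping used in scale are my assumptions) might look like:

```python
import numpy as np

def change(close, n):
    # Difference of the current price to the price n bars before,
    # divided by the current price.
    return (close[-1] - close[-1 - n]) / close[-1]

def price_range(close, high, low, n):
    # Total high-low distance of the last n candles,
    # divided by the current price.
    return (high[-n:].max() - low[-n:].min()) / close[-1]

def scale(x, limit=100.0):
    # Simplified stand-in for the compression step: squash into the
    # +/-limit range, then divide by limit to normalize to +/-1.
    return np.clip(x * limit, -limit, limit) / limit

close = np.array([100.0, 102.0, 101.0])
high = np.array([103.0, 104.0, 102.0])
low = np.array([99.0, 100.0, 100.5])
print(change(close, 2), price_range(close, high, low, 3), scale(2.0))
```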
Step 3: Preselecting/preprocessing predictors.
When you have selected a large number of indicators or other signals as features for your algorithm, you must determine which of them is useful and which not. There are many methods for reducing the number of features, for instance:
• Determine the correlations between the signals. Remove those with a strong correlation to other signals, since they do not contribute to the information.
• Compare the information content of signals directly, with algorithms like information entropy or decision trees.
• Determine the information content indirectly by comparing the signals with randomized signals; there are some software libraries for this, such as the R Boruta package.
• Use an algorithm like Principal Components Analysis (PCA) for generating a new signal set with reduced dimensionality.
• Use genetic optimization for determining the most important signals just by the most profitable results from the prediction process. Great for curve fitting if you want to publish impressive results in a research paper.
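The first method can be sketched in a few lines: drop any feature that is strongly correlated (the 0.95 threshold here is an arbitrary choice) with a feature already kept. The data is random placeholder values, with one feature built as a near-duplicate of another:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))                     # 4 candidate features
X[:, 3] = X[:, 0] + 0.01 * rng.normal(size=500)   # feature 3 duplicates feature 0

corr = np.corrcoef(X, rowvar=False)               # pairwise correlation matrix
keep = []
for j in range(X.shape[1]):
    # Keep feature j only if it is not strongly correlated
    # with any feature we have already kept.
    if all(abs(corr[j, k]) <= 0.95 for k in keep):
        keep.append(j)
print(keep)  # feature 3 is dropped as redundant
```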
For our experiment we do not need to preselect or preprocess the features, but you can find useful information about this in articles (1), (2), and (3) listed at the end of the page.
Step 4: Select the machine learning algorithm.
R offers many different ML packages, and any of them offers many different algorithms with many different parameters. Even if you already decided about the method – here, deep learning – you still have the choice among different approaches and different R packages. Most are quite new, and you won't find much empirical information to help your decision. You have to try them all and gain experience with different methods. For our experiment we’ve chosen the Deepnet package, which is probably the simplest and easiest-to-use deep learning library. This keeps our code short. We’re using its Stacked Autoencoder ( SAE ) algorithm for pre-training the network. Deepnet also offers a Restricted Boltzmann Machine ( RBM ) for pre-training, but I could not get good results from it. There are other and more complex deep learning packages for R, so you can spend a lot of time checking out all of them.
How pre-training works is easily explained, but why it works is a different matter. To my knowledge, no one has yet come up with a solid mathematical proof that it works at all. Anyway, imagine a large neural net with many hidden layers:
Training the net means setting up the connection weights between the neurons. The usual method is error backpropagation. But it turns out that the more hidden layers you have, the worse it works. The backpropagated error terms get smaller and smaller from layer to layer, causing the first layers of the net to learn almost nothing. This means that the predicted result becomes more and more dependent on the random initial state of the weights. It severely limited the complexity of layer-based neural nets and therefore the tasks they could solve – at least until about 10 years ago.
In 2006, scientists in Toronto first published the idea of pre-training the weights with an unsupervised learning algorithm, a restricted Boltzmann machine. This turned out to be a revolutionary concept. It boosted the development of artificial intelligence and enabled all sorts of new applications, from Go-playing machines to self-driving cars. In the case of a stacked autoencoder, it works this way:
1. Select the hidden layer to train; begin with the first hidden layer.
2. Connect its outputs to a temporary output layer that has the same structure as the network's input layer.
3. Feed the network with the training samples, but without the targets. Train it so that the first hidden layer reproduces the input signal – the features – at its outputs as exactly as possible. The rest of the network is ignored. During training, apply a 'weight penalty term' so that as few connection weights as possible are used for reproducing the signal.
4. Now feed the outputs of the trained hidden layer to the inputs of the next untrained hidden layer, and repeat the training process so that the input signal is now reproduced at the outputs of the next layer.
5. Repeat this process until all hidden layers are trained. We now have a 'sparse network' with very few layer connections that can reproduce the input signals.
6. Finally, train the network with backpropagation to learn the target variable, using the pre-trained weights of the hidden layers as a starting point.
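As a toy illustration of these steps – not the Deepnet implementation, just a minimal Python sketch with made-up layer sizes, learning rate, and epoch count, and without the weight penalty term – greedy layer-wise pretraining can look like this:

```python
import numpy as np

def pretrain_stack(X, layer_sizes, epochs=200, lr=0.1, seed=365):
    """Greedy layer-wise pretraining: each hidden layer learns to
    reconstruct its own input through a temporary linear output layer."""
    rng = np.random.default_rng(seed)
    weights, signal = [], X
    for n_hidden in layer_sizes:
        n_in = signal.shape[1]
        W = rng.normal(0, 0.1, (n_in, n_hidden))   # encoder weights
        V = rng.normal(0, 0.1, (n_hidden, n_in))   # temporary decoder
        for _ in range(epochs):
            H = np.tanh(signal @ W)                # hidden activation
            R = H @ V                              # linear reconstruction
            err = R - signal
            dV = H.T @ err / len(signal)
            dH = (err @ V.T) * (1 - H**2)          # backprop through tanh
            dW = signal.T @ dH / len(signal)
            V -= lr * dV
            W -= lr * dW
        weights.append(W)                          # decoder is thrown away
        signal = np.tanh(signal @ W)               # feed the next layer
    return weights

rng = np.random.default_rng(1)
X = np.tanh(rng.normal(size=(200, 8)))             # 8 features in [-1, 1]
Ws = pretrain_stack(X, [5, 3])
print([W.shape for W in Ws])                       # [(8, 5), (5, 3)]
```

The pre-trained weight matrices would then initialize the hidden layers before the supervised backpropagation pass.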
The hope is that the unsupervised pre-training process produces an internal noise-reduced abstraction of the input signals that can then be used for learning the target more easily. And this indeed appears to work. No one really knows why, but several theories – see paper (4) below – try to explain the phenomenon.
Step 5: Generate a test data set.
We first need to produce a data set with features and targets so that we can test our prediction process and try out parameters. The features must be based on the same price data as in live trading, and for the target we must simulate a short-term trade. So it makes sense to generate the data not with R, but with our trading platform, which is a lot faster anyway. Here's a small Zorro script for this, DeepSignals.c:
We're generating 2 years of data with features calculated by our change and range functions defined above. Our target is the result of a trade with a life time of 3 bars. Trading costs are set to zero, so in this case the result is equivalent to the sign of the price difference 3 bars in the future. The adviseLong function is described in the Zorro manual; it is a powerful function that automatically handles training and predicting, and allows using any R-based machine learning algorithm just as if it were a simple indicator.
In our code, the function uses the next trade return as the target, and the price changes and ranges of the last 4 bars as features. The SIGNALS flag tells it not to train on the data, but to export it to a .csv file. The BALANCED flag makes sure that we get as many positive as negative returns; this is important for most machine learning algorithms. Run the script in [Train] mode with our usual test asset EUR/USD selected. It generates a spreadsheet file named DeepSignalsEURUSD_L.csv that contains the features in the first 8 columns and the trade return in the last column.
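The labeling and balancing idea can be sketched outside Zorro as well. This is a hypothetical Python illustration (not the Zorro code; names and the random-walk data are my own): each bar is labeled with the sign of the price change 3 bars ahead, then the classes are subsampled to equal counts, as BALANCED does.

```python
import numpy as np

def make_targets(prices, horizon=3, seed=0):
    """Label each bar with the sign of the price change 'horizon'
    bars ahead, then subsample to balance the two classes."""
    ret = prices[horizon:] - prices[:-horizon]
    y = (ret > 0).astype(int)                 # 1 = positive trade outcome
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    n = min(len(pos), len(neg))               # equal class counts
    rng = np.random.default_rng(seed)
    keep = np.sort(np.concatenate([rng.choice(pos, n, replace=False),
                                   rng.choice(neg, n, replace=False)]))
    return y[keep], keep

prices = np.cumsum(np.random.default_rng(2).normal(size=1000))
y, idx = make_targets(prices)
print(y.mean())   # 0.5 - as many positive as negative samples
```

Without such balancing, a classifier on trending data can reach deceptively high accuracy by always predicting the majority class.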
Step 6: Calibrate the algorithm.
Complex machine learning algorithms have many parameters to adjust. Some of them offer great opportunities to curve-fit the algorithm for publications. Still, we must calibrate the parameters, since the algorithm rarely works well with its default settings. For this, here's an R script that reads the previously created data set and processes it with the deep learning algorithm (DeepSignal.r):
We've defined three functions, neural.train, neural.predict, and neural.init, for training, predicting, and initializing the neural net. The function names are not arbitrary, but follow the convention used by Zorro's advise(NEURAL, …) function. It doesn't matter now, but will matter later when we use the same R script for training and trading the deep learning strategy. A fourth function, TestOOS, is used for out-of-sample testing of our setup.
The function neural.init seeds the R random generator with a fixed value (365 is my personal lucky number). Otherwise we would get a slightly different result every time, since the neural net is initialized with random weights. It also creates a global R list named "Models". Most R variable types don't need to be created beforehand, but some do (don't ask me why). The '<<-' operator is for accessing a global variable from within a function.
The function neural.train takes as input a model number and the data set to be trained. The model number identifies the trained model in the "Models" list. A list is not really needed for this test, but we'll need it for more complex strategies that train more than one model. The matrix containing the features and target is passed to the function as the second parameter. If the XY data is not a proper matrix, which frequently happens in R depending on how you generated it, it is converted to one. Then it is split into the features (X) and the target (Y), and finally the target is converted to 1 for a positive trade outcome and 0 for a negative outcome.
The network parameters are then set up. Some are obvious, others are free to play around with:
The network structure is given by the hidden vector: c(50,100,50) defines 3 hidden layers, the first with 50, the second with 100, and the third with 50 neurons. That's the parameter that we'll later modify to determine whether deeper is better. The activation function converts the sum of a neuron's input values to the neuron's output; the most often used are the sigmoid, which saturates at 0 or 1, and tanh, which saturates at -1 or +1.
We use tanh here since our signals are also in the +/-1 range. The output of the network is a sigmoid function since we want a prediction in the 0..1 range. But the SAE output must be “linear” so that the Stacked Autoencoder can reproduce the analog input signals on the outputs.
The learning rate controls the step size for the gradient descent in training; a lower rate means finer steps and possibly a more precise prediction, but a longer training time. Momentum adds a fraction of the previous step to the current one. It prevents the gradient descent from getting stuck in a tiny local minimum or saddle point. The learning rate scale is a multiplication factor for changing the learning rate after each iteration (I am not sure what this is good for, but there may be tasks where a lower learning rate in later epochs improves the training). An epoch is a training iteration over the entire data set. Training will stop once the given number of epochs is reached. More epochs mean better prediction, but longer training. The batch size is the number of random samples – a mini batch – taken out of the data set for a single training run. Splitting the data into mini batches speeds up training, since the weight gradient is then calculated from fewer samples. The higher the batch size, the better the training, but the more time it will take. The dropout is a number of randomly selected neurons that are disabled during a mini batch. This way the net learns only with a part of its neurons. This seems a strange idea, but it can effectively reduce overfitting.
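To make the momentum parameter concrete, here is a minimal Python sketch (not the Deepnet code; the toy objective and parameter values are my own) of gradient descent with momentum – each step adds a fraction of the previous step:

```python
import numpy as np

def sgd_momentum(grad, w0, lr=0.5, momentum=0.5, steps=50):
    """Gradient descent with momentum: the velocity v carries a
    fraction of the previous step into the current one."""
    w = np.array(w0, float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = momentum * v - lr * grad(w)   # mix old step with new gradient
        w = w + v
    return w

# minimize f(w) = (w - 3)^2, whose gradient is 2*(w - 3)
w = sgd_momentum(lambda w: 2 * (w - 3), [0.0])
print(round(float(w[0]), 3))   # converges to 3.0
```

With momentum set to 0, the update reduces to plain gradient descent; larger values let the optimizer coast over small dips at the risk of overshooting.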
All these parameters are common to neural networks. Play around with them and check their effect on the result and the training time. Properly calibrating a neural net is not trivial and might be the topic of another article. The parameters are stored in the model together with the matrix of trained connection weights, so they need not be given again in the prediction function, neural.predict. It takes the model and a vector X of features, runs it through the layers, and returns the network output, the predicted target Y. Compared with training, prediction is pretty fast, since it only needs a couple thousand multiplications. If X was a row vector, it is transposed and this way converted to a column vector, otherwise the nn.predict function won't accept it.
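What the prediction pass does conceptually – tanh hidden layers followed by a sigmoid output in the 0..1 range – can be sketched in Python (random, untrained weights here, purely illustrative; this is not the nn.predict implementation):

```python
import numpy as np

def predict(hidden_weights, w_out, x):
    """Forward pass: tanh hidden layers, sigmoid output in 0..1.
    A 1-D feature vector is treated as a single sample (row)."""
    h = np.atleast_2d(x)
    for W in hidden_weights:
        h = np.tanh(h @ W)                 # hidden layer activation
    y = 1.0 / (1.0 + np.exp(-(h @ w_out))) # sigmoid output layer
    return y.ravel()

rng = np.random.default_rng(365)
hidden_weights = [rng.normal(size=(8, 5)), rng.normal(size=(5, 3))]
w_out = rng.normal(size=(3, 1))
y = predict(hidden_weights, w_out, rng.normal(size=8))
print(0.0 < float(y[0]) < 1.0)   # True - a probability-like output
```

The cost is only a handful of matrix multiplications, which is why prediction is so much faster than training.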
Use RStudio or a similar environment for conveniently working with R. Edit the path to the .csv data in the file above, source it, install the required R packages (deepnet, e1071, and caret), then call the TestOOS function from the command line. If everything works, it should print something like this:
TestOOS first reads our data set from Zorro's Data folder. It splits the data into 80% for training (XY.tr) and 20% for out-of-sample testing (XY.ts). The training set is trained and the result stored in the Models list at index 1. The test set is further split into features (X) and targets (Y). Y is converted to binary 0 or 1 and stored in Y.ob, our vector of observed targets. We then predict the targets from the test set, convert them again to binary 0 or 1, and store them in Y.pr. For comparing the observations with the predictions, we use the confusionMatrix function from the caret package.
A confusion matrix of a binary classifier is simply a 2×2 matrix that tells how many 0's and how many 1's were predicted correctly and incorrectly. Many metrics are derived from the matrix and printed in the lines above. The most important at the moment is the 62% prediction accuracy. This may hint that I bashed price action trading a little prematurely. But of course the 62% might have been just luck. We'll see later when we run a WFO test.
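For readers who want to see what such a matrix contains, here is the 2×2 count and the accuracy derived from it in plain Python (toy labels, not our test set; caret's confusionMatrix computes many more metrics than this):

```python
def confusion(y_obs, y_pred):
    """2x2 confusion matrix (tp, fp, fn, tn) and the accuracy."""
    tp = sum(o == 1 and p == 1 for o, p in zip(y_obs, y_pred))
    tn = sum(o == 0 and p == 0 for o, p in zip(y_obs, y_pred))
    fp = sum(o == 0 and p == 1 for o, p in zip(y_obs, y_pred))
    fn = sum(o == 1 and p == 0 for o, p in zip(y_obs, y_pred))
    acc = (tp + tn) / len(y_obs)
    return (tp, fp, fn, tn), acc

obs  = [1, 1, 0, 0, 1, 0, 1, 0]   # observed targets
pred = [1, 0, 0, 1, 1, 0, 1, 0]   # predicted targets
m, acc = confusion(obs, pred)
print(m, acc)   # (3, 1, 1, 3) 0.75
```

Accuracy is just (tp + tn) divided by the total; the off-diagonal counts show which kind of mistake dominates.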
One final piece of advice: R packages are occasionally updated, with the possible consequence that previous R code suddenly works differently, or not at all. This really happens, so test carefully after any update.
Step 7: The strategy.
Now that we've tested our algorithm and got some prediction accuracy above 50% with a test data set, we can finally code our machine learning strategy. In fact we've already coded most of it; we just have to add a few lines to the above Zorro script that exported the data set. This is the final script for training, testing, and (theoretically) trading the system (DeepLearn.c):
We're using a WFO cycle of one year, split into a 90% training and a 10% out-of-sample test period. You might ask why I earlier used two years' data and a different split, 80/20, for calibrating the network in steps 5 and 6. This is for using differently composed data for calibrating and for walk-forward testing. If we used exactly the same data, the calibration might overfit it and compromise the test.
The selected WFO parameters mean that the system is trained with about 225 days of data, followed by a 25-day test or trade period. Thus, in live trading the system would be retrained every 25 days, using the prices from the previous 225 days. In the literature you'll sometimes find the recommendation to retrain a machine learning system after every trade, or at least every day. But this does not make much sense to me. If you used almost a year's worth of data for training a system, it can obviously not deteriorate after a single day. Or if it did, and only produced positive test results with daily retraining, I would strongly suspect that the results are artifacts of some coding mistake.
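The window arithmetic of such a rolling walk-forward test can be sketched as follows (a simplified Python illustration with made-up bar counts, not Zorro's actual WFO implementation): each cycle trains on the first 90% of its window, tests on the remaining 10%, and the whole window then slides forward by the test length.

```python
def wfo_windows(n_bars, cycle, train_frac=0.9):
    """Rolling walk-forward windows: (train_start, test_start, test_end)
    index triples; the window advances by the test length each cycle."""
    train = int(cycle * train_frac)
    test = cycle - train
    windows, start = [], 0
    while start + cycle <= n_bars:
        windows.append((start, start + train, start + cycle))
        start += test
    return windows

# e.g. 300 'days' of data, a 100-day cycle split 90/10
for tr_start, te_start, te_end in wfo_windows(300, 100):
    print(f"train {tr_start}-{te_start - 1}, test {te_start}-{te_end - 1}")
```

Stitching the out-of-sample test segments together yields the continuous walk-forward equity curve shown later.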
Training a deep network really takes a long time, in our case about 10 minutes for a network with 3 hidden layers and 200 neurons. In live trading this would be done by a second Zorro process that is automatically started by the trading Zorro. In the backtest, the system trains in every WFO cycle. Therefore using multiple cores is recommended for training many cycles in parallel. Setting the NumCores variable to -1 activates all CPU cores but one. Multiple cores are only available in Zorro S, so a complete walk-forward test with all WFO cycles can take several hours with the free version.
In the script we now train both long and short trades. For this we have to allow hedging in Training mode, since long and short positions can be open at the same time. Entering a position now depends on the return value of the advise function, which in turn calls either the neural.train or the neural.predict function from the R script. So here we enter positions when the neural net predicts a result above 0.5.
The R script is now controlled by the Zorro script (for this it must have the same name, DeepLearn.r, only with a different extension). It is identical to our R script above, since we're using the same network parameters. Only one additional function is needed for supporting a WFO test:
The neural.save function stores the Models list – it now contains two models, for long and for short trades – after every training run in Zorro's Data folder. Since the models are stored for later use, we do not need to train them again for repeated test runs.
This is the WFO equity curve generated with the script above (EUR/USD, without trading costs):
EUR/USD equity curve with 50-100-50 network structure.
Although not all WFO cycles get a positive result, it seems that there is some predictive effect. The curve is equivalent to an annual return of 89%, achieved with a 50-100-50 hidden layer structure. We’ll check in the next step how different network structures affect the result.
Since the neural.init, neural.train, neural.predict, and neural.save functions are automatically called by Zorro's adviseLong/adviseShort functions, no R functions are directly called in the Zorro script. Thus the script can remain unchanged when using a different machine learning method. Only the DeepLearn.r script must be modified and the neural net, for instance, replaced by a support vector machine. For trading such a machine learning system live on a VPS, make sure that R is also installed on the VPS, the needed R packages are installed, and the path to the R terminal is set up in Zorro's ini file. Otherwise you'll get an error message when starting the strategy.
Step 8: The experiment.
If our goal had been developing a strategy, the next steps would be the reality check, risk and money management, and preparing for live trading, just as described under model-based strategy development. But for our experiment we'll now run a series of tests, with the number of neurons per layer increased from 10 to 100 in 3 steps, and 1, 2, or 3 hidden layers (deepnet does not support more than 3). So we're looking into the following 9 network structures: c(10), c(10,10), c(10,10,10), c(30), c(30,30), c(30,30,30), c(100), c(100,100), c(100,100,100). This experiment takes an afternoon even with a fast PC in multiple-core mode. Here are the results (SR = Sharpe ratio, R2 = slope linearity):
We see that a simple net with only 10 neurons in a single hidden layer won't work well for short-term prediction. Network complexity clearly improves the performance, but only up to a certain point. A good result for our system is already achieved with 3 layers x 30 neurons. Even more neurons won't help much and sometimes even produce a worse result. This is no real surprise, since for processing only 8 inputs, 300 neurons can likely not do a better job than 100.
Conclusion.
Our goal was to determine whether a few candles can have predictive power and how the results are affected by the complexity of the algorithm. The results seem to suggest that short-term price movements can indeed sometimes be predicted by analyzing the changes and ranges of the last 4 candles. The prediction is not very accurate – it's in the 58%..60% range, and most systems of the test series become unprofitable when trading costs are included. Still, I have to reconsider my opinion about price action trading. The fact that the prediction improves with network complexity is an especially convincing argument for short-term price predictability.
It would be interesting to look into the long-term stability of predictive price patterns. For this we would have to run another series of experiments and modify the training period (WFOPeriod in the script above) and the 90% IS/OOS split. This takes longer since we must use more historical data. I have done a few tests and found so far that a year seems indeed to be a good training period. The system deteriorates with periods longer than a few years. Predictive price patterns, at least of EUR/USD, have a limited lifetime.
Where can we go from here? There’s a plethora of possibilities, for instance:
- Use inputs from more candles and process them with far bigger networks with thousands of neurons.
- Use oversampling for expanding the training data. Prediction always improves with more training samples.
- Compress time series, for instance with spectral analysis, and analyze not the candles, but their frequency representation with machine learning methods.
- Use inputs from many candles – say, 100 – and pre-process adjacent candles with one-dimensional convolutional network layers.
- Use recurrent networks. Especially LSTMs could be very interesting for analyzing time series – and to my knowledge, they have rarely been used for financial prediction so far.
- Use an ensemble of neural networks for prediction, such as Aronson's "oracles" and "committees".
Papers / Articles.
(3) V. Perervenko, Selection of Variables for Machine Learning.
I've added the C and R scripts to the 2016 script repository. You need both in Zorro's Strategy folder. Zorro version 1.474 and R version 3.2.5 (64 bit) were used for the experiment, but it should also work with other versions.
73 thoughts on “Better Strategies 5: A Short-Term Machine Learning System”
I've tested your strategy using 30-min AAPL data, but "sae.dnn.train" returns all NaN in training.
(It works when decreasing the neurons to fewer than (5,10,5)… but the accuracy is 49%.)
Can you help me to understand why?
Thanks in advance.
If you have not changed any SAE parameters, look into the .csv data. It is then the only difference to the EUR/USD test. Maybe something is wrong with it.
Another fantastic article, jcl. Zorro is a remarkable environment for these experiments. Thanks for sharing your code and your approach – this really opens up an incredible number of possibilities to anyone willing to invest the time to learn how to use Zorro.
The problem with AAPL 30-min data was related to the normalizing method I used ((X-mean)/SD).
The features range was not between -1 and 1, and I assume that sae.dnn needs that to work…
Anyway, the performance is not comparable to yours 🙂
I have one question:
Why do you use Zorro for creating the features in the csv file and then open it in R?
Why not create the file with all the features in R in a few lines and do the training on that file when you are already in R, instead of going into Zorro and then to R?
When you want R to create the features, you must still transmit the price data and the targets from Zorro to R. So you are not gaining much. Creating the features in Zorro results usually in shorter code and faster training. Features in R make only sense when you need some R package for calculating them.
Really helpful and interesting article! I would like to know if there is an English version of the book:
“Das Börsenhackerbuch: Finanziell unabhängig durch algorithmische Handelssysteme”
I am really interested in it.
Not yet, but an English version is planned.
Thanks JCL! Please let me know when the English version is ready, because I am really interested in it.
Works superbly (as always). Thank you very much. One small note: if you have the package "dlm" loaded in R, TestOOS will fail with the error "Error in TestOOS() : cannot change value of locked binding for 'X'". This is due to there being a function X in the dlm package, so the name is locked when the package is loaded. Easily fixed by either renaming occurrences of the variable X to something else, or temporarily detaching the dlm package with: detach("package:dlm", unload=TRUE)
Thanks for the info about the dlm package. I admit that 'X' is not a particularly good name for a variable, but a function named 'X' in a distributed package is even a bit worse.
Results below were generated by a revised version of DeepSignals.r – the only change was the use of an LSTM net from the rnn package on CRAN. The authors of the package regard their LSTM implementation as "experimental" and do not feel it is learning properly yet, so hopefully more improvement to come there. (Spent ages trying to accomplish the LSTM element using the mxnet package, but gave up as I couldn't figure out the correct input format when using multiple training features.)
Will post results of the full WFO when I have finished the LSTM version of DeepLearn.r.
Confusion Matrix and Statistics.
95% CI : (0.5699, 0.5956)
No Information Rate : 0.5002.
P-Value [Acc > NIR] : <2e-16.
Mcnemar's Test P-Value : 0.2438.
Pos Pred Value : 0.5844.
Neg Pred Value : 0.5813.
Detection Rate : 0.2862.
Detection Prevalence : 0.4897.
Balanced Accuracy : 0.5828.
Results of WFO test below. Again, only change to original files was the use of LSTM in R, rather than DNN+SAE.
Walk-Forward Test DeepLearnLSTMV4 EUR/USD.
Simulated account AssetsFix.
Bar period 1 hour (avg 87 min)
Simulation period 15.05.2014-07.06.2016 (12486 bars)
Test period 04.05.2015-07.06.2016 (6649 bars)
Lookback period 100 bars (4 days)
WFO test cycles 11 x 604 bars (5 weeks)
Training cycles 12 x 5439 bars (46 weeks)
Monte Carlo cycles 200.
Assumed slippage 0.0 sec.
Spread 0.0 pips (roll 0.00/0.00)
Contracts per lot 1000.0.
Gross win/loss 3628$ / -3235$ (+5199p)
Average profit 360$/year, 30$/month, 1.38$/day.
Max drawdown -134$ 34% (MAE -134$ 34%)
Total down time 95% (TAE 95%)
Max down time 5 weeks from Aug 2015.
Max open margin 40$
Max open risk 35$
Trade volume 5710964$ (5212652$/year)
Transaction costs 0.00$ spr, 0.00$ slp, 0.00$ rol.
Capital required 262$
Number of trades 6787 (6195/year, 120/week, 25/day)
Percent winning 57.6%
Max win/loss 16$ / -14$
Avg trade profit 0.06$ 0.8p (+12.3p / -14.8p)
Avg trade slippage 0.00$ 0.0p (+0.0p / -0.0p)
Avg trade bars 1 (+1 / -2)
Max trade bars 3 (3 hours)
Time in market 177%
Max open trades 3.
Max loss streak 17 (uncorrelated 11)
Annual return 137%
Profit factor 1.12 (PRR 1.08)
Sharpe ratio 1.79.
Kelly criterion 2.34.
R2 coefficient 0.435.
Ulcer index 13.3%
Prediction error 152%
Confidence level AR DDMax Capital.
Portfolio analysis OptF ProF Win/Loss Wgt% Cycles.
EUR/USD .219 1.12 3907/2880 100.0 XX/\//\X///
EUR/USD:L .302 1.17 1830/1658 65.0 /\/\//\////
EUR/USD:S .145 1.08 2077/1222 35.0 \//\//\\///
Interesting! For a still experimental LSTM implementation, that result doesn't look bad.
Sorry for being completely off topic, but could you please point me to the best place where I can learn to code trend lines? I'm a complete beginner, but from trading experience I see them as an important part of what I would like to build…
Robot Wealth has an algorithmic trading course for that – you can find details on his blog robotwealth/.
I think you misunderstand the meaning of pretraining. See my articles at mql5/ru/articles/1103.
I think this stage is described more fully there.
I don’t think I misunderstood pretraining, at least not more than everyone else, but thanks for the links!
Can you paste your LSTM R code, please?
Could you help me answer some questions?
I have a few questions below:
1. I want to test Commission mode. If I use Interactive Brokers, what should I set Commission to in the normal case?
2. If I press the [Trade] button, I see in the log that the script will use DeepLearn_EURUSD.ml. So in real trading it will use DeepLearn_EURUSD.ml to get the model, and use the neural.predict function to trade?
3. If I use a slow computer to train the data, should I move DeepLearn_EURUSD.ml to the trade computer?
I tested real trading on my Interactive Brokers account and pressed the result button.
Can I use Commission=0.60 to train the neural net and get the real result?
Result button will show the message below:
Trade Trend EUR/USD.
Bar period 2 min (avg 2 min)
Trade period 02.11.2016-02.11.2016.
Spread 0.5 pips (roll -0.02/0.01)
Contracts per lot 1000.0.
Commission should normally not be set up in the script, but entered in the broker-specific asset list. Otherwise you would have to change the script every time you want to test it with a different broker or account. IB has different lot sizes and commissions, so you need to add the command.
to the script when you want to test it for an IB account.
Yes, DeepLearn_EURUSD.ml is the model for live trading, and you need to copy it to the trade computer.
Do I write assetList("AssetsIB.csv") in the right place?
So the result of the code below includes commission?
I tested the result with commission, and it seems pretty good.
Annual +93% +3177p.
BarPeriod = 60; // 1 hour.
WFOPeriod = 252*24; // 1 year.
NumCores = -1; // use all CPU cores but one.
Spread = RollLong = RollShort = Commission = Slippage = 0;
if(Train) Hedge = 2;
I ran DeepLearn.c in IB paper trading.
The code "LifeTime = 3; // prediction horizon" seems to close the position that you opened after 3 bars (3 hours).
But I can't see it close the position on the third bar close.
I see the logs below:
Closing prohibited – check NFA flag!
[EUR/USD::L4202] Can’t close 11.10995 at 09:10:51.
In my IB paper trading, the default order size is 1k on EUR/USD.
How do I change the order size in paper trading?
Thank you very much.
IB is an NFA-compliant broker. You cannot close trades on NFA accounts. You must set the NFA flag for opening a reverse position instead. And you must enable trading costs, otherwise including the commission has no effect. I don't think that you'll get a positive result with trading costs.
Those account issues are not related to machine learning, and are better asked on the Zorro forum. Or even better, read the Zorro manual where all this is explained. Just search for “NFA”.
I did some experiments changing the neural net's parameters with commission.
The code is below:
BarPeriod = 60; // 1 hour.
WFOPeriod = 252*24; // 1 year.
NumCores = -1; // use all CPU cores but one.
Spread = RollLong = RollShort = Slippage = 0;
if(Train) Hedge = 2;
I get a result with commission where the Annual Return is about +23%.
But I don't completely understand Zorro's settings and Zorro's report.
Walk-Forward Test DeepLearn EUR/USD.
Simulated account AssetsIB.csv.
Bar period 1 hour (avg 86 min)
Simulation period 15.05.2014-09.09.2016 (14075 bars)
Test period 23.04.2015-09.09.2016 (8404 bars)
Lookback period 100 bars (4 days)
WFO test cycles 14 x 600 bars (5 weeks)
Training cycles 15 x 5401 bars (46 weeks)
Monte Carlo cycles 200.
Simulation mode Realistic (slippage 0.0 sec)
Spread 0.0 pips (roll 0.00/0.00)
Contracts per lot 20000.0.
Gross win/loss 24331$ / -22685$ (+914p)
Average profit 1190$/year, 99$/month, 4.58$/day.
Max drawdown -1871$ 114% (MAE -1912$ 116%)
Total down time 92% (TAE 41%)
Max down time 18 weeks from Dec 2015.
Max open margin 2483$
Max open risk 836$
Trade volume 26162350$ (18916130$/year)
Transaction costs 0.00$ spr, 0.00$ slp, 0.00$ rol, -1306$ com.
Capital required 5239$
Number of trades 1306 (945/year, 19/week, 4/day)
Percent winning 52.5%
Max win/loss 375$ / -535$
Avg trade profit 1.26$ 0.7p (+19.7p / -20.3p)
Avg trade slippage 0.00$ 0.0p (+0.0p / -0.0p)
Avg trade bars 2 (+2 / -3)
Max trade bars 3 (3 hours)
Time in market 46%
Max open trades 3.
Max loss streak 19 (uncorrelated 10)
Annual return 23%
Profit factor 1.07 (PRR 0.99)
Sharpe ratio 0.56.
Kelly criterion 1.39.
R2 coefficient 0.000.
Ulcer index 20.8%
Confidence level AR DDMax Capital.
10% 29% 1134$ 4153$
20% 27% 1320$ 4427$
30% 26% 1476$ 4656$
40% 24% 1649$ 4911$
50% 23% 1767$ 5085$
60% 22% 1914$ 5301$
70% 21% 2245$ 5789$
80% 19% 2535$ 6216$
90% 16% 3341$ 7403$
95% 15% 3690$ 7917$
100% 12% 4850$ 9625$
Portfolio analysis OptF ProF Win/Loss Wgt% Cycles.
EUR/USD .256 1.07 685/621 100.0 /X/XXXXXXXXXXX.
The manual is your friend:
Great read…I built this framework to use XGB to analyze live ETF price movements. Let me know what you think:
Hi, deep learning researcher and programmer here. 🙂
Great blog and great article, congratulations! I have some comments:
– if you use ReLUs as activation functions, pretraining is not necessary.
– AE generally refers to networks with the same input and output; I would call the proposed network rather an MLP (multi-layer perceptron).
Do you think it is possible to use Python-based (like TensorFlow) or Lua-based (like Torch7) deep learning libraries with Zorro?
I have also heard that ReLUs make a network so fast that you can brute-force train it in some cases, with no pretraining. But I have not yet experimented with that. The described network is commonly called an 'SAE', since it uses autoencoders, with indeed the same number of inputs and outputs, for the pre-training process. – I am not familiar with Torch7, but you can theoretically use TensorFlow with Zorro through a DLL-based interface. The network structure must still be defined in Python, but Zorro can use the network for training and prediction.
Would you do YouTube tutorials on your work, this series of articles? And where can I subscribe to this kind of algorithmic trading tutorial? Thanks for your contribution.
I would do YouTube tutorials if someone paid me very well for them. Until then, you can subscribe to this blog with the link on the right above.
Why not feed economic data from a calendar like forexfactory into the net as well? I suggested that several times before. This data is what makes me a profitable manual trader (rookie though); if there is any intelligence in these neural networks, it should improve performance greatly. Input must be the name (non-farm payrolls, for example, or some unique identifier), time left until release, predicted value (like 3-5 days before), last value, and revision. Some human institutional traders claim it's possible to trade profitably from this data alone, without a chart. Detecting static support and resistance areas (horizontal lines) should be superior to any simple candle patterns. It can be mathematically modeled, as the Support and Resistance indicator from Point Zero Trading proves. Unfortunately I don't have a clue how Arturo the programmer did it. I imagine an artificial intelligence actually "seeing" what the market is focused on (like speculation on a better than expected NFP report based on other positive data in the days before, driving the dollar up into the report). "Seeing" significant support and resistance levels should allow for trading risk, making reasonable decisions on where to place SL and TP.
We have also found that well-chosen external data, not derived from the price curve, can improve the prediction. There is even a trading system based on Trump's twitter outpourings. I can't comment on support and resistance, since I know no successful systems that use them, and am not sure that they exist at all.
Thank you very much for everything that you have done so far.
I read the book (German here, too) and am working through your blog articles right now.
I have already learnt a lot and am still learning more and more about the really important stuff (other than: your mindset must be perfect and you need to have well-defined goals. I never was a fan of such things, and finally I found someone who is of the same opinion and actually teaches people how to do it correctly).
So, thank you very much and thanks in advance for all upcoming articles that I will read and you will post.
As a thank you I was thinking about sending you a corrected version of your book (there are some typos and wrong articles here and there…). Would you be interested in that?
Again thank you for everything and please keep up the good work.
Thank you! And I'm certainly interested in a list of all my mistakes.
Thank you for this interesting post. I ran it on my PC and obtained similar results to yours. Then I wanted to see if it could perform as well when commission, rollover, and slippage were included in the test. I used the same figures as the ones used in the workshops and included in the AssetsFix.csv file. The modifications I did in your DeepLearn.c file are as follows:
Spread = RollLong = RollShort = Commission = Slippage = 0;
The results then were not as optimistic as without commission:
Walk-Forward Test DeepLearn_realistic EUR/USD.
Simulated account AssetsFix.
Bar period 1 hour (avg 86 min)
Simulation period 09.05.2014-27.01.2017 (16460 bars)
Test period 22.04.2015-27.01.2017 (10736 bars)
Lookback period 100 bars (4 days)
WFO test cycles 18 x 596 bars (5 weeks)
Training cycles 19 x 5367 bars (46 weeks)
Monte Carlo cycles 200.
Simulation mode Realistic (slippage 5.0 sec)
Spread 0.5 pips (roll -0.02/0.01)
Contracts per lot 1000.0.
Gross win/loss 5608$ / -6161$ (-6347p)
Average profit -312$/year, -26$/month, -1.20$/day.
Max drawdown -635$ -115% (MAE -636$ -115%)
Total down time 99% (TAE 99%)
Max down time 85 weeks from Jun 2015.
Max open margin 40$
Max open risk 41$
Trade volume 10202591$ (5760396$/year)
Transaction costs -462$ spr, 46$ slp, -0.16$ rol, -636$ com.
Capital required 867$
Number of trades 10606 (5989/year, 116/week, 24/day)
Percent winning 54.9%
Max win/loss 18$ / -26$
Avg trade profit -0.05$ -0.6p (+11.1p / -14.8p)
Avg trade slippage 0.00$ 0.0p (+1.5p / -1.7p)
Avg trade bars 1 (+1 / -2)
Max trade bars 3 (3 hours)
Time in market 188%
Max open trades 3.
Max loss streak 19 (uncorrelated 12)
Annual return -36%
Profit factor 0.91 (PRR 0.89)
Sharpe ratio -1.39.
Kelly criterion -5.39.
R2 coefficient 0.737.
Ulcer index 100.0%
Confidence level AR DDMax Capital.
Portfolio analysis OptF ProF Win/Loss Wgt% Cycles.
EUR/USD .000 0.91 5820/4786 100.0 XX/\XX\X\X/X/\\X\\
I am a complete beginner with Zorro; maybe I made a mistake? What do you think?
No, your results look absolutely ok. The predictive power of 4 candles is very weak. This is just an experiment for finding out if price action has any predictive power at all.
Although it apparently has, I have not yet seen a really profitable system with this method. From the machine learning systems that we’ve programmed so far, all that turned out profitable used data from a longer price history.
Thank you for the great article, it’s exactly what I needed in order to start experimenting with ML in Zorro.
I've noticed that the results are slightly different each time despite using the random seed. Here it doesn't matter, thanks to the large number of trades, but with daily bars, for example, the performance metrics fluctuate much more. My question is: do you happen to know where the randomness comes from? Is it still the training process in R, despite the seed?
It is indeed so. Deepnet apparently also uses an internal function, not only the R random function, for randomizing some initial values.
Any idea how to use machine learning with indicators, as in this example? You could do that as Better Strategies 6.
It would be very interesting.
Is grid search allowed inside the neural.train function? I get an error when I try it.
Besides, Andy, how did you end up defining the LSTM structure using rnn? It is not clear to me after reading inside the package.
where is the full code?(or where is the repository?)
You said: "Use genetic optimization for determining the most important signals just by the most profitable results from the prediction process. Great for curve fitting." How about, after using the genetic optimization process to determine the most profitable signals, matching and measuring the most profitable signals with distance metrics/similarity analysis (mutual information, DTW, the Fréchet distance algorithm, etc.), then using the distance metrics/similarity analysis as a function for neural network prediction? Does that make sense?
Distance to what? To each other?
Yes: find similar profitable signal patterns in history, find the distance between the patterns/profitable signals, and then predict the future behavior of the profitable signal from past patterns.
Was wondering about this point you made in Step 5:
“Our target is the return of a trade with 3 bars life time.”
But doesn't the code mean that we are actually predicting the SIGN of the return, rather than the return itself?
Yes. Only the binary win/loss result, not the magnitude of the win or loss, is used for the prediction.
“When you used almost 1 year’s data for training a system, it can obviously not deteriorate after a single day. Or if it did, and only produced positive test results with daily retraining, I would strongly suspect that the results are artifacts by some coding mistake.”
There is an additional trap to be aware of related to jcl’s comment above that applies to supervised machine learning techniques (where you train a model against actual outcomes). Assume you are trying to predict the return three bars ahead (as in the example above – LifeTime = 3;). In real time you obviously don’t have access to the outcomes for one, two and three bars ahead with which to retrain your model, but when using historical data you do. With frequently retrained models (especially if using relatively short blocks of training data) it is easy to train a model offline (and get impressive results) with data you will not have available for training in real time. Then reality kicks in. Therefore truncating your offline training set by N bars (where N is the number of bars ahead you are trying to predict) may well be advisable…
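The truncation the comment describes can be sketched like this (a hypothetical Python illustration; function and variable names are my own, not the article's code):

```python
# Sketch of the look-ahead trap: with LifeTime = 3, the outcome of the
# last 3 bars is unknown in real time, so those bars must not be used
# as training samples. All names here are illustrative.
def build_training_samples(returns, lifetime=3):
    """Pair each bar's value with the outcome 'lifetime' bars ahead,
    dropping the final bars whose outcome would not yet be known live."""
    samples = []
    for i in range(len(returns) - lifetime):   # truncate by N = lifetime bars
        features = returns[i]
        target = returns[i + lifetime]          # known only in hindsight
        samples.append((features, target))
    return samples

returns = [0.1, -0.2, 0.3, 0.1, -0.1, 0.2, 0.0, 0.4]
samples = build_training_samples(returns, lifetime=3)
print(len(samples))  # 5 usable samples from 8 bars
```

Offline, the last `lifetime` bars would silently carry future information into the model if they were not dropped.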
Amazing work! Could you please share the WFO code as well? I was able to run the code up to neural.save, but was unable to generate the WFO results.
Thank you very much.
The code above does use WFO.
Dear jcl, in the text you mentioned that you could predict the current leg of the zigzag indicator. Could you please elaborate on how to do that? What features and responses would you recommend?
I would never claim that I could predict the current leg of zigzag indicator. But we have indeed coded a few systems that attempted that. For this, simply use not the current price movement, but the current zigzag slope as a training target. Which parameters you use for the features is completely up to you.
Good work. I was wondering if you ever tried using something like a net long/short ratio of the asset (i.e. the FXCM SSI index, real-time live data) as a feature to improve prediction?
Not with the FXCM SSI index, since it is not available as historical data as far as I know. But similar data of other markets, such as order book content, COT report or the like, have been used as features to a machine learning system.
I see, thanks. And what's the experience with those? Do they have any predictive power? If you know any materials on this, I would be very interested to read them. (FYI, the SSI index can be exported from FXCM Trading Station; daily data from 2003 for most currency pairs.)
Thanks for the info with the SSI. Yes, additional market data can have predictive power, especially from the order book. But since we gathered this experience with contract work for clients, I'm not at liberty to disclose details. However we plan our own study with ML evaluation of additional data, and that might result in an article on this blog.
Thanks jcl, looking forward to it! There is a way to record SSI ratios in a CSV file from a Lua strategy script (FXCM's scripting language) for live evaluation. Happy to give you some details if you decide to evaluate this (drop me an email). MyFxbook also has a similar indicator, but no historical data on that one, unfortunately.
Does the random forest algorithm have any advantage over deep nets or neural networks for classification problems in financial data? Let me make it more clear: I use a number of moving averages and oscillator slope colour changes for trading decisions (buy/sell/hold). Sometimes one oscillator's colour change is lagging, another is faster, etc. There is no problem picking tops and bottoms, but it is quite challenging to know when to hold. Since random forests don't need normalization, do they have any advantage over deep nets or neural networks for classification? Thanks.
This depends on the system and the features, so there is no general answer. In the systems we did so far, a random forest or single decision tree was sometimes indeed better than a standard neural network, but a deep network beats anything, especially since you need not care as much about feature preselection. We meanwhile do most ML systems with deep networks.
I see, thank you. I have seen some new implementations of LSTM which sound interesting. One is called phased LSTM; another is from Yarin Gal. He is using a Bayesian technique (Gaussian processes) as dropout: cs.ox.ac.uk/people/yarin.gal/website/blog_2248.html.
I hooked up the news flow from forexfactory into this algo and predictive power has improved by 7%.
I downloaded the forexfactory news history from 2010 and used an algo to convert it into a value between -1 and 1 for EUR. This value becomes another parameter in the neural training network. I think there is real value there. Let me see if we can get the win ratio to 75%, and then I think we have a real winner on our hands here.
The neural training somehow only yields results with EURUSD.
Has anyone tried GBP/USD or EUR/JPY?
That's also my experience. There are only a few asset types with which price pattern systems seem to really work, and that's mainly EUR/USD and some cryptos. We also had pattern systems with GBP/USD and USD/JPY, but they work less well and need more complex algos. Most currencies don't expose patterns at all.
JCL, you are saying: "The R script is now controlled by the Zorro script (for this it must have the same name, NeuralLearn.r, only with a different extension)."
...the same name as what? Shouldn't it say DeepLearn.r (instead of NeuralLearn.r)? Where is the name "NeuralLearn" coming from? We don't seem to have used it anywhere else. Sorry, I am not sure what I am missing here; could you please clarify?
That's right, DeepLearn.r it is. That was a wrong name in the text. The files in the repository should be correctly named.
Thanks for your reply jcl, much appreciated.
I love your work. And I have got lots to learn.
Hope you don’t mind me asking another question …

Better Strategies 4: Machine Learning.
Deep Blue was the first computer that won a chess world championship. That was 1997, and it took almost 20 years until another program, AlphaGo, could defeat the best human Go player. Deep Blue was a model-based system with hardwired chess rules. AlphaGo is a data-mining system, a deep neural network trained with thousands of Go games. Not improved hardware, but a breakthrough in software was essential for the step from beating top chess players to beating top Go players.
In this 4th part of the mini-series we’ll look into the data mining approach for developing trading strategies. This method does not care about market mechanisms. It just scans price curves or other data sources for predictive patterns. Machine learning or “Artificial Intelligence” is not always involved in data-mining strategies. In fact the most popular – and surprisingly profitable – data mining method works without any fancy neural networks or support vector machines.
Machine learning principles.
A learning algorithm is fed with data samples, normally derived in some way from historical prices. Each sample consists of n variables x1..xn, commonly named predictors, features, signals, or simply input. These predictors can be the price returns of the last n bars, or a collection of classical indicators, or any other imaginable functions of the price curve (I've even seen the pixels of a price chart image used as predictors for a neural network!). Each sample also normally includes a target variable y, like the return of the next trade after taking the sample, or the next price movement. In the literature you can find y also named label or objective. In a training process, the algorithm learns to predict the target y from the predictors x1..xn. The learned 'memory' is stored in a data structure named model that is specific to the algorithm (not to be confused with a financial model for model-based strategies!). A machine learning model can be a function with prediction rules in C code, generated by the training process. Or it can be a set of connection weights of a neural network.
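As a toy illustration (my own Python sketch, not the article's Zorro/R code), samples could be built from a price series like this, with the last 4 returns as predictors and the sign of the next return as target:

```python
# Build (features, target) samples from a price series: predictors are
# the last 4 price returns, the target is the sign of the next return.
# Purely illustrative; the article's systems use Zorro/R, not this code.
def make_samples(prices, n_features=4):
    returns = [prices[i + 1] - prices[i] for i in range(len(prices) - 1)]
    samples = []
    for i in range(n_features, len(returns)):
        x = returns[i - n_features:i]        # predictors x1..xn
        y = 1 if returns[i] > 0 else -1      # target: next move's sign
        samples.append((x, y))
    return samples

prices = [1.10, 1.12, 1.11, 1.13, 1.14, 1.12, 1.15]
samples = make_samples(prices)
print(len(samples), samples[0])
```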
The predictors, features, or whatever you call them, must carry information sufficient to predict the target y with some accuracy. They must also often fulfill two formal requirements. First, all predictor values should be in the same range, like -1..+1 (for most R algorithms) or -100..+100 (for Zorro or TSSB algorithms). So you need to normalize them in some way before sending them to the machine. Second, the samples should be balanced, i.e. equally distributed over all values of the target variable. So there should be about as many winning as losing samples. If you do not observe these two requirements, you'll wonder why you're getting bad results from the machine learning algorithm.
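Both requirements can be sketched in a few lines of Python (an illustrative sketch with my own function names, not the article's code):

```python
import random

# Scale a feature column into -1..+1 (first requirement).
def normalize(values):
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

# Balance winners and losers by subsampling the majority class
# (second requirement). Each sample is (features, target).
def balance(samples):
    wins = [s for s in samples if s[1] > 0]
    losses = [s for s in samples if s[1] <= 0]
    n = min(len(wins), len(losses))
    random.seed(1)  # deterministic for the example
    return random.sample(wins, n) + random.sample(losses, n)

print(normalize([2, 4, 6]))        # [-1.0, 0.0, 1.0]
data = [([0.1], 1)] * 6 + [([0.2], -1)] * 2
print(len(balance(data)))          # 4 samples: 2 winners, 2 losers
```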
Regression algorithms predict a numeric value, like the magnitude and sign of the next price move. Classification algorithms predict a qualitative sample class, for instance whether it’s preceding a win or a loss. Some algorithms, such as neural networks, decision trees, or support vector machines, can be run in both modes.
A few algorithms learn to divide samples into classes without needing any target y. That's unsupervised learning, as opposed to supervised learning using a target. Somewhere in between is reinforcement learning, where the system trains itself by running simulations with the given features, and using the outcome as training target. AlphaZero, the successor of AlphaGo, used reinforcement learning by playing millions of Go games against itself. In finance there are few applications for unsupervised or reinforcement learning. 99% of machine learning strategies use supervised learning.
Whatever signals we're using for predictors in finance, they will most likely contain much noise and little information, and will be nonstationary on top of it. Therefore financial prediction is one of the hardest tasks in machine learning. More complex algorithms do not necessarily achieve better results. The selection of the predictors is critical to the success. It is not a good idea to use lots of predictors, since this simply causes overfitting and failure in out-of-sample operation. Therefore data mining strategies often apply a preselection algorithm that determines a small number of predictors out of a pool of many. The preselection can be based on correlation between predictors, on significance, on information content, or simply on prediction success with a test set. Practical experiments with feature selection can be found in a recent article on the Robot Wealth blog.
Here’s a list of the most popular data mining methods used in finance.
1. Indicator soup.
Most trading systems we’re programming for clients are not based on a financial model. The client just wanted trade signals from certain technical indicators, filtered with other technical indicators in combination with more technical indicators. When asked how this hodgepodge of indicators could be a profitable strategy, he normally answered: “Trust me. I’m trading it manually, and it works.”
It did indeed. At least sometimes. Although most of those systems did not pass a WFA test (and some not even a simple backtest), a surprisingly large number did. And those were also often profitable in real trading. The client had systematically experimented with technical indicators until he found a combination that worked in live trading with certain assets. This way of trial-and-error technical analysis is a classical data mining approach, just executed by a human and not by a machine. I cannot really recommend this method, and a lot of luck, not to speak of money, is probably involved, but I can testify that it sometimes leads to profitable systems.
2. Candle patterns.
Not to be confused with those Japanese Candle Patterns that had their best-before date long, long ago. The modern equivalent is price action trading . You’re still looking at the open, high, low, and close of candles. You’re still hoping to find a pattern that predicts a price direction. But you’re now data mining contemporary price curves for collecting those patterns. There are software packages for that purpose. They search for patterns that are profitable by some user-defined criterion, and use them to build a specific pattern detection function. It could look like this one (from Zorro’s pattern analyzer):
This C function returns 1 when the signals match one of the patterns, otherwise 0. You can see from the lengthy code that this is not the fastest way to detect patterns. A better method, used by Zorro when the detection function needs not be exported, is sorting the signals by their magnitude and checking the sort order. An example of such a system can be found here.
Can price action trading really work? Just like the indicator soup, it’s not based on any rational financial model. One can at best imagine that sequences of price movements cause market participants to react in a certain way, this way establishing a temporary predictive pattern. However the number of patterns is quite limited when you only look at sequences of a few adjacent candles. The next step is comparing candles that are not adjacent, but arbitrarily selected within a longer time period. This way you’re getting an almost unlimited number of patterns – but at the cost of finally leaving the realm of the rational. It is hard to imagine how a price move can be predicted by some candle patterns from weeks ago.
Still, a lot of effort is going into that. A fellow blogger, Daniel Fernandez, runs a subscription website (Asirikuy) specialized in data mining candle patterns. He refined pattern trading down to the smallest details, and if anyone would ever achieve any profit this way, it would be him. But to his subscribers' disappointment, trading his patterns live (QuriQuant) produced very different results from his wonderful backtests. If profitable price action systems really exist, apparently no one has found them yet.
3. Linear regression.
The simple basis of many complex machine learning algorithms: predict the target variable y by a linear combination of the predictors x1..xn:

y = a0 + a1*x1 + a2*x2 + ... + an*xn

The coefficients an are the model. They are calculated for minimizing the sum of squared differences between the true y values from the training samples and their predicted y from the above formula:

E = Sum over all samples of (y_true - y_predicted)^2
For normal distributed samples, the minimizing is possible with some matrix arithmetic, so no iterations are required. In the case n = 1, with only one predictor variable x, the regression formula is reduced to

y = a + b*x

which is simple linear regression, as opposed to multivariate linear regression where n > 1. Simple linear regression is available in most trading platforms, f.i. with the LinReg indicator in TA-Lib. With y = price and x = time, it's often used as an alternative to a moving average. Multivariate linear regression is available in the R platform through the lm(..) function that comes with the standard installation. A variant is polynomial regression. Like simple regression it uses only one predictor variable x, but also its square and higher degrees, so that xn == x^n:

y = a0 + a1*x + a2*x^2 + ... + an*x^n
With n = 2 or n = 3 , polynomial regression is often used to predict the next average price from the smoothed prices of the last bars. The polyfit function of MatLab, R, Zorro, and many other platforms can be used for polynomial regression.
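As a minimal illustration of the least-squares fit (a Python sketch with invented data, not any platform's code), the n = 1 case has the closed form b = cov(x,y)/var(x) and a = mean(y) - b*mean(x), which recovers an exact line perfectly:

```python
# Closed-form simple linear regression y = a + b*x, minimizing the
# sum of squared differences (illustrative sketch, not platform code).
def linreg(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# Points on the exact line y = 2 + 3x are recovered:
a, b = linreg([0, 1, 2, 3], [2, 5, 8, 11])
print(round(a, 6), round(b, 6))  # 2.0 3.0
```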
4. Perceptron.
Often referred to as a neural network with only one neuron. In fact a perceptron is a regression function like above, but with a binary result, thus called logistic regression . It’s not regression though, it’s a classification algorithm. Zorro’s advise(PERCEPTRON, …) function generates C code that returns either 100 or -100, dependent on whether the predicted result is above a threshold or not:
You can see that the sig array is equivalent to the features xn in the regression formula, and the numeric factors are the coefficients an.
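In spirit, such a generated function is just a weighted sum compared against a threshold. A hypothetical sketch (the weights and bias are invented for illustration, not taken from any generated code):

```python
# A perceptron: weighted sum of signals, thresholded to +-100,
# mimicking the shape of a generated advise(PERCEPTRON) function.
# The weights and bias here are made up for illustration.
def perceptron(sig, weights, bias):
    s = bias + sum(w * x for w, x in zip(weights, sig))
    return 100 if s > 0 else -100

weights = [0.5, -1.2, 0.8]
print(perceptron([1.0, 0.2, 0.5], weights, bias=-0.1))  # 100
print(perceptron([0.1, 1.0, 0.0], weights, bias=-0.1))  # -100
```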
5. Neural networks.
Linear or logistic regression can only solve linear problems. Many do not fall into this category – a famous example is predicting the output of a simple XOR function. And most likely also predicting prices or trade returns. An artificial neural network (ANN) can tackle nonlinear problems. It’s a bunch of perceptrons that are connected together in an array of layers . Any perceptron is a neuron of the net. Its output goes to the inputs of all neurons of the next layer, like this:
Like the perceptron, a neural network also learns by determining the coefficients that minimize the error between sample prediction and sample target. But this requires now an approximation process, normally with backpropagating the error from the output to the inputs, optimizing the weights on its way. This process imposes two restrictions. First, the neuron outputs must now be continuously differentiable functions instead of the simple perceptron threshold. Second, the network must not be too deep – it must not have too many ‘hidden layers’ of neurons between inputs and output. This second restriction limits the complexity of problems that a standard neural network can solve.
When using a neural network for predicting trades, you have a lot of parameters with which you can play around and, if you’re not careful, produce a lot of selection bias :
Number of hidden layers.
Number of neurons per hidden layer.
Number of backpropagation cycles, named epochs.
Learning rate, the step width of an epoch.
Momentum, an inertia factor for the weights adaption.
Activation function.
The activation function emulates the perceptron threshold. For the backpropagation you need a continuously differentiable function that generates a ‘soft’ step at a certain x value. Normally a sigmoid , tanh , or softmax function is used. Sometimes it’s also a linear function that just returns the weighted sum of all inputs. In this case the network can be used for regression, for predicting a numeric value instead of a binary outcome.
Neural networks are available in the standard R installation ( nnet , a single hidden layer network) and in many packages, for instance RSNNS and FCNN4R .
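As a toy illustration of backpropagation (my own Python sketch, not one of these packages), here is a tiny network learning the XOR function mentioned above:

```python
import math, random

# Tiny 2-2-1 network trained by backpropagation on XOR, the classic
# problem a single perceptron cannot solve. Pure illustration.
random.seed(2)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # 2 inputs + bias
w_o = [random.uniform(-1, 1) for _ in range(3)]                      # 2 hidden + bias
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

def forward(x):
    h = [sig(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    o = sig(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, o

def loss():
    return sum((forward(x)[1] - y) ** 2 for x, y in data)

before = loss()
for _ in range(5000):                       # epochs
    for x, y in data:
        h, o = forward(x)
        d_o = (o - y) * o * (1 - o)         # output delta
        for j in range(2):                  # backpropagate to hidden layer
            d_h = d_o * w_o[j] * h[j] * (1 - h[j])
            for i in range(2):
                w_h[j][i] -= 2.0 * d_h * x[i]   # learning rate 2.0
            w_h[j][2] -= 2.0 * d_h
        for j in range(2):
            w_o[j] -= 2.0 * d_o * h[j]
        w_o[2] -= 2.0 * d_o

print(before > loss())  # training reduced the squared error
```

The error is propagated from the output neuron back to the hidden layer, and each weight moves a small step against its gradient.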
6. Deep learning.
Deep learning methods use neural networks with many hidden layers and thousands of neurons, which could not be effectively trained anymore by conventional backpropagation. Several methods became popular in recent years for training such huge networks. They usually pre-train the hidden neuron layers for achieving a more effective learning process. A Restricted Boltzmann Machine (RBM) is an unsupervised classification algorithm with a special network structure that has no connections between the hidden neurons. A Sparse Autoencoder (SAE) uses a conventional network structure, but pre-trains the hidden layers in a clever way by reproducing the input signals on the layer outputs with as few active connections as possible. Those methods allow very complex networks for tackling very complex learning tasks. Such as beating the world's best human Go player.
Deep learning networks are available in the deepnet and darch R packages. Deepnet provides an autoencoder, Darch a restricted Boltzmann machine. I have not yet experimented with Darch, but here’s an example R script using the Deepnet autoencoder with 3 hidden layers for trade signals through Zorro’s neural() function:
7. Support vector machines.
Like a neural network, a support vector machine (SVM) is another extension of linear regression. When we look at the regression formula again,

y = a0 + a1*x1 + a2*x2 + ... + an*xn
we can interpret the features xn as coordinates of an n-dimensional feature space. Setting the target variable y to a fixed value determines a plane in that space, called a hyperplane since it has more than two (in fact, n-1) dimensions. The hyperplane separates the samples with y > 0 from the samples with y < 0. The an coefficients can be calculated in a way that the distances of the plane to the nearest samples, which are called the 'support vectors' of the plane, hence the algorithm name, are maximum. This way we have a binary classifier with optimal separation of winning and losing samples.
The problem: normally those samples are not linearly separable; they are scattered around irregularly in the feature space. No flat plane can be squeezed between winners and losers. If it could, we had simpler methods to calculate that plane, f.i. linear discriminant analysis. But for the common case we need the SVM trick: adding more dimensions to the feature space. For this the SVM algorithm produces more features with a kernel function that combines any two existing predictors into a new feature. This is analogous to the step above from simple regression to polynomial regression, where more features are also added by taking the sole predictor to the n-th power. The more dimensions you add, the easier it is to separate the samples with a flat hyperplane. This plane is then transformed back to the original n-dimensional space, getting wrinkled and crumpled on the way. By cleverly selecting the kernel function, the process can be performed without actually computing the transformation.
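The effect of an added kernel-derived dimension can be seen with an XOR-like case: four points that no line can separate in 2 dimensions become separable after adding the product feature x1*x2 (a hypothetical illustration, not the SVM algorithm itself):

```python
# Kernel-trick illustration: the XOR-like points below are not linearly
# separable in 2 dimensions, but adding the product feature z = x1*x2
# (what a polynomial kernel produces implicitly) makes the flat plane
# z = 0 a perfect separator. Purely illustrative.
samples = [((1, 1), +1), ((-1, -1), +1), ((1, -1), -1), ((-1, 1), -1)]

def lift(x):
    return (x[0], x[1], x[0] * x[1])   # add the new dimension

separable = all((lift(x)[2] > 0) == (y > 0) for x, y in samples)
print(separable)  # the hyperplane z = 0 separates the classes
```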
Like neural networks, SVMs can be used not only for classification, but also for regression. They also offer some parameters for optimizing and possibly overfitting the prediction process:
Kernel function. You normally use an RBF kernel (radial basis function, a symmetric kernel), but you also have the choice of other kernels, such as sigmoid, polynomial, and linear.
Gamma, the width of the RBF kernel.
Cost parameter C, the 'penalty' for wrong classifications in the training samples.
An often used SVM is the libsvm library. It’s also available in R in the e1071 package. In the next and final part of this series I plan to describe a trading strategy using this SVM.
8. K-Nearest neighbor.
Compared with the heavy ANN and SVM stuff, that's a nice simple algorithm with a unique property: it needs no training. So the samples are the model. You could use this algorithm for a trading system that learns permanently by simply adding more and more samples. The nearest neighbor algorithm computes the distances in feature space from the current feature values to the k nearest samples. A distance in n-dimensional space between two feature sets (x1..xn) and (y1..yn) is calculated just as in 2 dimensions:

d = sqrt((x1-y1)^2 + (x2-y2)^2 + ... + (xn-yn)^2)
The algorithm simply predicts the target from the average of the k target variables of the nearest samples, weighted by their inverse distances. It can be used for classification as well as for regression. Software tricks borrowed from computer graphics, such as an adaptive binary tree (ABT), can make the nearest neighbor search pretty fast. In my past life as a computer game programmer, we used such methods in games for tasks like self-learning enemy intelligence. You can call the knn function in R for nearest neighbor prediction, or write a simple function in C for that purpose.
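A minimal inverse-distance-weighted k-nearest-neighbor predictor could look like this (my own Python sketch, not the R knn function):

```python
import math

# k-nearest neighbor: no training; the samples themselves are the model.
# Predicts the target as the inverse-distance-weighted average of the
# k nearest samples' targets. Illustrative sketch.
def knn_predict(samples, query, k=2):
    dists = sorted(
        (math.dist(x, query), y) for x, y in samples)  # Python 3.8+
    num = sum(y / (d + 1e-9) for d, y in dists[:k])    # avoid div by 0
    den = sum(1 / (d + 1e-9) for d, _ in dists[:k])
    return num / den

samples = [((0, 0), 0.0), ((1, 1), 1.0), ((5, 5), 10.0)]
print(knn_predict(samples, (0.9, 0.9)))  # close to 1.0
```

Adding a new sample is just appending to the list, which is why such a system can keep learning while it trades.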
9. K-Means.
This is an approximation algorithm for unsupervised classification. It has some similarity, not only in its name, to k-nearest neighbor. For classifying the samples, the algorithm first places k random points in the feature space. Then it assigns to any of those points all the samples with the smallest distances to it. The point is then moved to the mean of these nearest samples. This will generate a new samples assignment, since some samples are now closer to another point. The process is repeated until the assignment does not change anymore by moving the points, i.e. each point lies exactly at the mean of its nearest samples. We now have k classes of samples, each in the neighborhood of one of the k points.
This simple algorithm can produce surprisingly good results. In R, the kmeans function does the trick. An example of the k-means algorithm for classifying candle patterns can be found here: Unsupervised candlestick classification for fun and profit.
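The iteration can be sketched in a few lines (an illustrative Python sketch; the initial centers are chosen by hand here instead of randomly):

```python
import math

# K-means iteration: assign samples to the nearest center, move each
# center to the mean of its samples, repeat until nothing changes.
def kmeans(points, centers):
    while True:
        groups = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)),
                    key=lambda j: math.dist(p, centers[j]))
            groups[i].append(p)
        new = [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]
        if new == centers:
            return centers, groups
        centers = new

points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centers, groups = kmeans(points, centers=[(1.0, 1.0), (9.0, 9.0)])
print(centers)  # two centers, one near each obvious cluster
```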
10. Naive Bayes.
This algorithm uses Bayes' Theorem for classifying samples of non-numeric features (i.e. events), such as the above mentioned candle patterns. Suppose that an event X (for instance, that the Open of the previous bar is below the Open of the current bar) appears in 80% of all winning samples. What is then the probability that a sample is winning when it contains event X? It's not 0.8 as you might think. The probability can be calculated with Bayes' Theorem:

P(Y|X) = P(X|Y) * P(Y) / P(X)
P(Y|X) is the probability that event Y (f.i. winning) occurs in all samples containing event X (in our example, Open(1) < Open(0)). According to the formula, it is equal to the probability of X occurring in all winning samples (here, 0.8), multiplied by the probability of Y in all samples (around 0.5 when you were following my above advice of balanced samples), and divided by the probability of X in all samples.
If we are naive and assume that all events X are independent of each other, we can calculate the overall probability that a sample is winning by simply multiplying the probabilities P(X|winning) for every event X. This way we end up with this formula:

P(winning|X1..Xn) = s * P(X1|winning) * P(X2|winning) * ... * P(Xn|winning)
with a scaling factor s . For the formula to work, the features should be selected in a way that they are as independent as possible, which imposes an obstacle for using Naive Bayes in trading. For instance, the two events Close(1) < Close(0) and Open(1) < Open(0) are most likely not independent of each other. Numerical predictors can be converted to events by dividing the number into separate ranges.
The Naive Bayes algorithm is available in the ubiquitous e1071 R package.
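The naive multiplication can be sketched directly (a toy Python example with invented event probabilities, not the e1071 implementation):

```python
# Naive Bayes scoring over binary events: multiply P(event|class) for
# every observed event, times the class prior. The conditional
# probabilities below are invented for illustration only.
def nb_score(events, cond_probs, prior):
    score = prior
    for e in events:
        score *= cond_probs[e]
    return score

# Hypothetical conditional probabilities, estimated from balanced samples:
p_win  = {"open_up": 0.8, "close_up": 0.6}
p_loss = {"open_up": 0.3, "close_up": 0.5}

observed = ["open_up", "close_up"]
s_win = nb_score(observed, p_win, prior=0.5)    # 0.5 * 0.8 * 0.6 = 0.24
s_loss = nb_score(observed, p_loss, prior=0.5)  # 0.5 * 0.3 * 0.5 = 0.075
print(s_win > s_loss)  # classify as winning
```

The scaling factor s cancels out when only the class with the larger score is wanted, so it is omitted here.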
11. Decision and regression trees.
Those trees predict an outcome or a numeric value based on a series of yes/no decisions, in a structure like the branches of a tree. Any decision is either the presence of an event or not (in case of non-numerical features) or a comparison of a feature value with a fixed threshold. A typical tree function, generated by Zorro’s tree builder, looks like this:
How is such a tree produced from a set of samples? There are several methods; Zorro uses Shannon information entropy, which already had an appearance on this blog in the Scalping article. At first it checks one of the features, let's say x1. It places a hyperplane with the plane formula x1 = t into the feature space. This hyperplane separates the samples with x1 > t from the samples with x1 < t. The dividing threshold t is selected so that the information gain, the difference between the information entropy of the whole space and the sum of the information entropies of the two divided sub-spaces, is maximum. This is the case when the samples in the subspaces are more similar to each other than the samples in the whole space.
This process is then repeated with the next feature x 2 and two hyperplanes splitting the two subspaces. Each split is equivalent to a comparison of a feature with a threshold. By repeated splitting, we soon get a huge tree with thousands of threshold comparisons. Then the process is run backwards by pruning the tree and removing all decisions that do not lead to substantial information gain. Finally we end up with a relatively small tree as in the code above.
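The entropy-based threshold selection for a single split can be sketched like this (an illustrative Python sketch of the criterion, not Zorro's tree builder):

```python
import math

# Pick the split threshold t for one feature that maximizes information
# gain: entropy of the whole sample set minus the weighted entropies of
# the two subsets. Illustrative sketch.
def entropy(labels):
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def best_split(values, labels):
    base = entropy(labels)
    best = (0.0, None)
    for t in sorted(set(values))[1:]:               # candidate thresholds
        left = [l for v, l in zip(values, labels) if v < t]
        right = [l for v, l in zip(values, labels) if v >= t]
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(labels)
        if gain > best[0]:
            best = (gain, t)
    return best

# Feature values below 5 always lose, above 5 always win:
values = [1, 2, 3, 4, 6, 7, 8, 9]
labels = ['L', 'L', 'L', 'L', 'W', 'W', 'W', 'W']
print(best_split(values, labels))  # full gain of 1 bit at threshold 6
```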
Decision trees have a wide range of applications. They can produce excellent predictions superior to those of neural networks or support vector machines. But they are not a one-fits-all solution, since their splitting planes are always parallel to the axes of the feature space. This somewhat limits their predictions. They can be used not only for classification, but also for regression, for instance by returning the percentage of samples contributing to a certain branch of the tree. Zorro’s tree is a regression tree. The best known classification tree algorithm is C5.0 , available in the C50 package for R.
For improving the prediction even further or overcoming the parallel-axis limitation, an ensemble of trees can be used, called a random forest. The prediction is then generated by averaging or voting the predictions from the individual trees. Random forests are available in the R packages randomForest, ranger, and Rborist.
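To make the bagging-and-voting idea concrete, here is a toy sketch that trains one-split stumps (the simplest possible trees) on bootstrap samples and lets them vote. All names are hypothetical; for real work you would use the R packages above or a library such as scikit-learn:

```python
import random
from collections import Counter

def train_stump(data):
    """Fit a one-split 'tree' on (x, y) pairs: pick the threshold
    that minimizes classification errors."""
    best = None
    for t, _ in data:
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        if not left or not right:
            continue
        lmaj = Counter(left).most_common(1)[0][0]   # majority class left
        rmaj = Counter(right).most_common(1)[0][0]  # majority class right
        errors = sum(y != lmaj for y in left) + sum(y != rmaj for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, lmaj, rmaj)
    if best is None:  # degenerate sample: predict the majority class
        maj = Counter(y for _, y in data).most_common(1)[0][0]
        return lambda x: maj
    _, t, lmaj, rmaj = best
    return lambda x: lmaj if x <= t else rmaj

def random_forest(data, n_trees=25, seed=1):
    """Bagging: train each stump on a bootstrap sample of the data,
    then predict by majority vote over all stumps."""
    rng = random.Random(seed)
    stumps = [train_stump([rng.choice(data) for _ in data])
              for _ in range(n_trees)]
    def predict(x):
        votes = Counter(s(x) for s in stumps)
        return votes.most_common(1)[0][0]
    return predict
```

With `data = [(1, 0), (2, 0), (3, 0), (10, 1), (11, 1), (12, 1)]`, the ensemble votes class 0 for small feature values and class 1 for large ones, even though each individual stump only saw a resampled subset.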
Conclusion.
There are many different data mining and machine learning methods at your disposal. The critical question: what is better, a model-based or a machine learning strategy? There is no doubt that machine learning has a lot of advantages. You don’t need to care about market microstructure, economy, trader psychology, or similar soft stuff. You can concentrate on pure mathematics. Machine learning is a much more elegant, more attractive way to generate trade systems. It has all advantages on its side but one. Despite all the enthusiastic threads on trader forums, it tends to mysteriously fail in live trading.
Every second week a new paper about trading with machine learning methods is published (a few can be found below). Please take all those publications with a grain of salt. According to some papers, fantastic win rates in the range of 70%, 80%, or even 85% have been achieved. Although win rate is not the only relevant criterion – you can lose even with a high win rate – 85% accuracy in predicting trades is normally equivalent to a profit factor above 5. With such a system the involved scientists should be billionaires by now. Unfortunately I never managed to reproduce those win rates with the described methods, and didn’t even come close. So maybe a lot of selection bias went into the results. Or maybe I’m just too stupid.
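The claimed equivalence is easy to check: with wins and losses of equal size, the profit factor (gross profit divided by gross loss) reduces to p / (1 - p), so an 85% win rate gives 0.85 / 0.15 ≈ 5.7. A minimal sketch (function name hypothetical):

```python
def profit_factor(win_rate, avg_win=1.0, avg_loss=1.0):
    """Profit factor = gross profit / gross loss.
    With equal-sized wins and losses it reduces to p / (1 - p)."""
    return (win_rate * avg_win) / ((1 - win_rate) * avg_loss)

# 85% win rate with symmetric trades -> profit factor of about 5.7
print(profit_factor(0.85))
```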
Compared with model-based strategies, I have not seen many successful machine learning systems so far. And from what one hears about the algorithmic methods of successful hedge funds, machine learning still seems to be rarely used. But maybe this will change in the future with the availability of more processing power and the advent of new algorithms for deep learning.
Classification using deep neural networks: Dixon et al., 2016. Predicting price direction using ANN & SVM: Kara et al., 2011. Empirical comparison of learning algorithms: Caruana et al., 2006. Mining stock market tendency using GA & SVM: Yu, Wang, Lai, 2005.
The next part of this series will deal with the practical development of a machine learning strategy.
30 thoughts on “Better Strategies 4: Machine Learning”
Nice post. There is a lot of potential in this approach toward the market.
Btw, are you using the code editor that comes with Zorro? How is it possible to get such a colour configuration?
The colorful script is produced by WordPress. You can’t change the colors in the Zorro editor, but you can replace it with other editors that support individual colors, for instance Notepad++.
Is it then possible that Notepad++ detects the Zorro variables in the scripts? I mean that BarPeriod is highlighted as it is in the Zorro editor?
Theoretically yes, but for this you would have to configure the syntax highlighting of Notepad++ and enter all variables in the list. As far as I know, Notepad++ also cannot be configured to display the function description in a window, as the Zorro editor does. There’s no perfect tool…
Concur with the final paragraph. I have tried many machine learning techniques after reading various ‘peer reviewed’ papers. But reproducing their results remains elusive. When I live test with ML I can’t seem to outperform random entry.
ML fails in live trading? Maybe the training of the ML has to be done with price data that includes historical spread, roll, ticks and so on?
I think reason #1 for live failure is data mining bias, caused by biased selection of inputs and parameters to the algo.
Thanks to the author for the great series of articles.
However, it should be noted that we don’t need to narrow our view to predicting only the next price move. It may happen that the next move goes against our trade in 70% of cases, but it is still worth making the trade. This happens when the price finally does go in the right direction, but before that it may make some steps against us. If we delay the trade by one price step, we will not enter the mentioned 30% of trades, but in exchange we will increase the result of the remaining 70% by one price step. So the criterion is which value is higher: N*average_result or 0.7*N*(average_result + price_step).
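The comparison in that comment can be written out directly (function names hypothetical); for example, with 100 signals, an average result of 10 and a price step of 5, the delayed entry comes out ahead:

```python
def immediate_entry(n, avg_result):
    """Enter on every signal: N trades with the average result."""
    return n * avg_result

def delayed_entry(n, avg_result, price_step, hit_rate=0.7):
    """Wait one price step: skip the 30% of trades whose first move
    is adverse, gain one extra step on the 70% that go our way."""
    return hit_rate * n * (avg_result + price_step)

# 100 signals, average result 10, price step 5:
# immediate -> 100 * 10 = 1000, delayed -> 0.7 * 100 * 15 = 1050
print(immediate_entry(100, 10), delayed_entry(100, 10, 5))
```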
Nice post. If you just want to play around with some machine learning, I implemented a very simple ML tool in Python and added a GUI. It’s implemented to predict time series.
Thanks JCL, I found your article very interesting. I would like to ask you, from your expertise in trading, where can we download reliable historical forex data? I consider it very important due to the fact that the Forex market is decentralized.
Thanks in advance!
There is no really reliable Forex data, since every Forex broker creates their own data. They all differ slightly depending on which liquidity providers they use. FXCM has relatively good M1 and tick data with few gaps. You can download it with Zorro.
Thanks for writing such a great article series JCL… a thoroughly enjoyable read!
I have to say though that I don’t view model-based and machine learning strategies as being mutually exclusive; I have had some OOS success by using a combination of the elements you describe.
To be more exact, I begin the system generation process by developing a ‘traditional’ mathematical model, but then use a set of online machine learning algorithms to predict the next terms of the various different time series (not the price itself) that are used within the model. The actual trading rules are then derived from the interactions between these time series. So in essence I am not just blindly throwing recent market data into an ML model in an effort to predict price action direction, but instead develop a framework based upon sound investment principles in order to point the models in the right direction. I then data mine the parameters and measure the level of data-mining bias as you’ve described also.
It’s worth mentioning however that I’ve never had much success with Forex.
Anyway, best of luck with your trading and keep up the great articles!
Thanks for posting this great mini series JCL.
I recently studied a few recent papers about ML trading, deep learning especially. Yet I found that most of them evaluated the results without a risk-adjusted metric; i.e., they usually used ROC curves or PnL to support their experiments instead of, for example, the Sharpe ratio.
Also, they seldom mentioned the trading frequency in their experimental results, making it hard to evaluate the potential profitability of those methods. Why is that? Do you have any good suggestions for dealing with those issues?
ML papers normally aim for high accuracy. Equity curve variance is of no interest. This is sort of justified because the ML prediction quality determines accuracy, not variance.
Of course, if you want to really trade such a system, variance and drawdown are important factors. A system with lower accuracy and worse prediction can in fact be preferable when it’s less dependent on market conditions.
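For readers who want to compute such risk-adjusted figures themselves, here is a minimal sketch of an annualized Sharpe ratio and a maximum drawdown calculation. The function names and the 252-trading-days convention are assumptions, not taken from any of the papers discussed:

```python
import math

def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of a series of per-period returns
    (risk-free rate assumed zero for simplicity)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a
    fraction of the preceding peak."""
    peak, mdd = equity[0], 0.0
    for v in equity:
        peak = max(peak, v)
        mdd = max(mdd, (peak - v) / peak)
    return mdd
```

For instance, the equity curve 100 → 120 → 90 → 130 → 110 has a maximum drawdown of (120 − 90) / 120 = 25%, even though it ends above its starting value.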
“In fact the most popular – and surprisingly profitable – data mining method works without any fancy neural networks or support vector machines.”
Would you please name those most popular & surprisingly profitable ones. So I could directly use them.
I was referring to the Indicator Soup strategies. For obvious reasons I can’t disclose details of such a strategy, and have never developed such systems myself. We’re merely coding them. But I can tell that coming up with a profitable Indicator Soup requires a lot of work and time.
Well, I am just starting a project that uses simple EMAs to predict price; it just selects the correct EMAs based on past performance, an algorithm selection that provides some rustic degree of intelligence.
There are the following issues with ML, and with trading systems in general, which are based on historical data analysis:
1) Historical data doesn’t encode information about future price movements.
Future price movement is independent and not related to the price history. There is absolutely no reliable pattern which can be used to systematically extract profits from the market. Applying ML methods in this domain is simply pointless and doomed to failure and is not going to work if you search for a profitable system. Of course you can curve fit any past period and come up with a profitable system for it.
The only thing which determines price movement is demand and supply, and these are often the result of external factors which cannot be predicted. For example: a war breaks out somewhere, or another major disaster strikes, or someone just needs to buy a large amount of a foreign currency for some business/investment purpose. These sorts of events will cause significant shifts in the demand-supply structure of the FX market. As a consequence, prices begin to move, but nobody really cares about price history, just about the execution of the incoming orders. An automated trading system can only be profitable if it monitors a significant portion of the market and takes the supply and demand into account for making a trading decision. But this is not the case with any of the systems being discussed here.
2) Race to the bottom.
Even if (1) weren’t true and there were valuable information encoded in historical price data, you would still face the following problem: there are thousands of gold diggers out there, all of them using similar methods and even the same tools to search for profitable systems and analyze the same historical price data. As a result, many of them will discover the same or very similar “profitable” trading systems, and when they begin actually trading those systems, they will become less and less profitable due to the nature of the market.
The only sure winners in this scenario will be the technology and tool vendors.
I will be still keeping an eye on your posts as I like your approach and the scientific vigor you apply. Your blog is the best of its kind – keep the good work!
One hint: there are profitable automated systems, but they are not based on historical price data but on proprietary knowledge about the market structure and operations of the major institutions which control these markets. Let’s say there are many inefficiencies in the current system but you absolutely have no chance to find the information about those by analyzing historical price data. Instead you have to know when and how the institutions will execute market moving orders and front run them.
Thanks for the extensive comment. I often hear these arguments and they sound indeed intuitive; the only problem is that they are easily proven wrong. The scientific way is experiment, not intuition. Simple tests show that past and future prices are often correlated – otherwise every second experiment on this blog would have had a very different outcome. Many successful funds, for instance Jim Simons’ Renaissance fund, are mainly based on algorithmic prediction.
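Such a correlation test is simple to run yourself. A minimal sketch of the sample autocorrelation of a series (function name hypothetical); applied to a price or return series, a value clearly different from zero indicates that past values carry information about future ones:

```python
def autocorr(xs, lag=1):
    """Sample autocorrelation of a series at a given lag."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

# A trending series is strongly positively autocorrelated,
# a perfectly mean-reverting one negatively.
print(autocorr([1, 2, 3, 4, 5, 6, 7, 8]))
print(autocorr([1, -1, 1, -1, 1, -1, 1, -1]))
```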
One more thing: in my comment I have been implicitly referring to the buy side (hedge funds, traders etc.), not to the sell side (market makers, banks). The latter always has the edge because they sell at the ask and buy at the bid, pocketing the spread as an additional profit on top of any strategy they might be running. Regarding Jim Simons’ Renaissance: I am not so sure they have not transitioned over time to the sell side in order to stay profitable. There is absolutely no information available about the nature of their business besides the vague statement that they are using solely quantitative algorithmic trading models…
Thanks for the informative post!
Regarding the use of some of these algorithms, a common complaint is that financial data is non-stationary… Do you find this to be a problem? Couldn’t one just use returns data instead, which is (I think) stationary?
Yes, this is a problem for sure. If financial data were stationary, we’d all be rich. I’m afraid we have to live with what it is. Returns are not any more stationary than other financial data.
Hello sir, I developed a set of rules for my trading which identifies supply/demand zones, then volume and other criteria. Can you help me make it into an automated system? If I am going to do that myself it will take too much time. Please contact me at svadukiagmail if you are interested.
Sure, please contact my employer at infoopgroup. de. They’ll help.
Technical analysis has always been rejected and looked down upon by quants, academics, or anyone trained in traditional finance theories. I worked for the proprietary trading desk of a first-tier bank for a good part of my career, surrounded by ivy-league elites with backgrounds in finance, math, or financial engineering. I must admit none of those guys knew how to trade direction. They were good at market making, product structures, index arb, but almost none could make money trading direction. Why? Because none of these guys believed in technical analysis. Then again, if you are already making your millions, why bother taking the risk of trading direction with your own money. For me, luckily, my years of training in technical analysis allowed me to really retire after being laid off in the great recession. I look only at EMA, slow stochastics, and MACD; and I have made money every year since I started in 2009. Technical analysis works, you just have to know how to use it!!
