Saturday 7 April 2018

Trading signals in R


Creating Trading Signals in R.
I am building a trading strategy and I am stuck in two key areas. Using Stoch and MACD in quantmod, I am trying to create a signal when the slow stochastic crosses the fast stochastic (1), the reverse (-1), and flat when it is in between (0). The MACD code is identical except for the MACD and Signal column names. Finally, I am trying to merge the three signals into one master signal for when all three signals equal 1, -1, or 0.
Update: I fixed all the nasty loops by using a difference instead, following this answer.
Here is how I would approach this problem. You are calculating every position that has the desired relationship. You only want the first position that satisfies the trading signal, so you can act on it as soon as possible.
I would set up the Bollinger band signal like this:
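As a rough sketch (not the original snippet), one way to do this with quantmod/TTR on daily OHLC data is shown below; the ticker, the 20/2 band parameters, and the buy-below-lower-band convention are illustrative assumptions.

library(quantmod)   # loads TTR, which provides BBands

getSymbols("AAPL", from = "2017-01-01")        # placeholder ticker
bb <- BBands(HLC(AAPL), n = 20, sd = 2)        # columns: dn, mavg, up, pctB

# +1 when the close dips below the lower band, -1 above the upper band, 0 otherwise
bbSig <- ifelse(Cl(AAPL) < bb$dn,  1,
         ifelse(Cl(AAPL) > bb$up, -1, 0))
bbSig[is.na(bbSig)] <- 0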
I would create the stochastic signal like this:
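A hedged sketch of the crossover logic described just below, using TTR's stoch(); the fastD/slowD column names come from TTR, while the exact crossover and 0.2/0.8 filtering rules follow my reading of the text rather than the original code.

st <- stoch(HLC(AAPL), nFastK = 14, nFastD = 3, nSlowD = 3)

# Difference between the fast and slow stochastic lines
d <- st$fastD - st$slowD

# Crossover: the sign of the difference flips between position i-1 and i
stochSig <- ifelse(d > 0 & lag(d) <= 0,  1,     # fast crosses above slow
            ifelse(d < 0 & lag(d) >= 0, -1, 0)) # fast crosses below slow

# Keep only signals fired in oversold/overbought territory (0.2 / 0.8)
stochSig <- ifelse(stochSig ==  1 & st$fastD < 0.2,  1,
            ifelse(stochSig == -1 & st$fastD > 0.8, -1, 0))
stochSig[is.na(stochSig)] <- 0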
Once you compute the difference, you want to find the first crossover where one is higher than the other, so you need to consider the i-th and (i-1)-th positions. In addition, the signal will be stronger if you are in overbought or oversold territory (0.8 or 0.2).
Likewise for MACD:
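A matching sketch for MACD, using TTR's MACD() (columns macd and signal); the 12/26/9 parameters are the TTR defaults, not necessarily the poster's settings.

mac <- MACD(Cl(AAPL), nFast = 12, nSlow = 26, nSig = 9)
dm  <- mac$macd - mac$signal

macdSig <- ifelse(dm > 0 & lag(dm) <= 0,  1,
           ifelse(dm < 0 & lag(dm) >= 0, -1, 0))
macdSig[is.na(macdSig)] <- 0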
Now we merge them and compute the combined signal.
If it were me, I would rather have a sum of the signals, because that tells you how reliable each signal is. If you have a 3, that is strong, but a 1 or a 2 is not as strong. So I would go with the sum as the combined signal.
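A sketch of the merge-and-sum step, using the placeholder signal names from the sketches above.

sigs <- merge(bbSig, stochSig, macdSig)
colnames(sigs) <- c("bb", "stoch", "macd")

# Combined strength: the row sum ranges from -3 (strong sell) to +3 (strong buy)
sigs$combined <- rowSums(sigs, na.rm = TRUE)
tail(sigs)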
Now everything sits in one matrix with all the signals, and the last column is the combined signal strength.
Also think about how this might not give you a good signal. Using this approach on this chart, the strongest signals I get are -2, and I only get 5 of them. Kind of strange, since the chart goes straight up, yet there are no strong buys.
These sell signals only give a small dip before the chart rockets higher. Of course, it all depends on the stock, etc.
You also get situations like this:
Some indicators are faster or slower than others. This would be my approach, but you should do large-scale backtesting and determine whether you think these would be actionable trades, and whether you would make any money acting on them net of commissions and holding duration.

Trading signals in R
Often, market strategies based on robust, structural algorithms produce far better results than complex, multi-layered strategies.
In this R strategy, the resistance line can be detected by connecting the last 3 consecutive peaks in a bullish trend, or 3 successive valleys in a bearish trend, with a trendline.
If the price breaks this trendline and pulls back, a powerful trading signal can be detected when a strong candlestick appears at the point where the price crosses the trendline in the direction opposite to the main market trend.
There would be two Take Profit prices, taking other confirmations and signals into account, and therefore two Stop Loss prices.
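The post gives no code, but the peak-and-trendline step could be sketched roughly as below; findPeaks, the span window, the plain numeric price vector px, and the use of a simple linear fit are all illustrative assumptions rather than the author's method.

# Find swing highs in a close-price vector px: a point is a peak if it is the
# maximum within +/- span observations of itself
findPeaks <- function(x, span = 5) {
  which(sapply(seq_along(x), function(i) {
    lo <- max(1, i - span); hi <- min(length(x), i + span)
    x[i] == max(x[lo:hi])
  }))
}

peaks <- tail(findPeaks(px), 3)          # the last three consecutive peaks
trend <- lm(px[peaks] ~ peaks)           # straight trendline through those peaks
trendlineAt <- function(i) as.numeric(predict(trend, data.frame(peaks = i)))

# Breakout candidate: the latest close crossing the trendline value at that index
# (the direction of the cross depends on the buy/sell setup described next)
breakout <- tail(px, 1) > trendlineAt(length(px))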
Buy signal.
In a downtrend, the market price has formed 3 consecutive valleys connected by a trendline.
This trendline acts as a strong resistance line with a certain downward slope. After the price had crossed below the resistance line, it pulled back.
If the price then breaks through the resistance line to the upside and a strong candlestick appears at the breakout point, a Buy signal is generated, provided the close of that candlestick is above the resistance line.
Entry price: above the high of the candlestick at the breakout point. Take profit: the length of the last candlestick above the entry price, or the height from the pullback point to the close of the last candlestick, above the entry price. Stop loss: the low of the last candlestick, or below the last valley price.
Sell signal.
In an uptrend, the market price has formed 3 consecutive peaks connected by a trendline.
This trendline acts as a strong resistance line with a certain upward slope. After the price had crossed above the resistance line, it pulled back.
If the price then breaks through the resistance line to the downside and a strong candlestick appears at the breakout point, a Sell signal is generated, provided the close price of that candlestick is below the resistance line.
Entry price: below the low of the candlestick at the breakout point. Take profit: the length of the last candlestick below the entry price, or the height from the pullback point to the close of the last candlestick, below the entry price. Stop loss: the high of the last candlestick, or above the last peak price.

Hybrid signal extraction, forecasting, and financial trading.
iMetrica: econometrics and financial trading strategies.
Tagged with high-frequency trading in R.
High-frequency financial trading on index futures with MDFA and R: an example with the EURO STOXX50.
Figure 1: In-sample and out-of-sample (observations 240-457) performance of the trading signal for the Euro Stoxx50 index futures with March 18 expiration (STXE H3), over the period 1-9-2018 to 2-1-2018, using 15-minute log-returns. The black dotted lines indicate a buy/long signal and the blue dotted lines indicate a sell/short (top panel).
In this second tutorial on building high-frequency financial trading signals using the multivariate direct filter approach (MDFA) in R, I focus on the first example from my previous article on signal engineering in high-frequency trading of financial index futures, where I consider 15-minute log-returns of Euro STOXX50 index futures with expiration on March 18, 2018 (STXE H3). As I mentioned in the introduction, I have added a slightly new step to my approach to constructing signals for intraday observations, as I have been studying the problem of close-to-open variations in the frequency domain. With 15-minute log-return data, I look at the frequency structure related to the close-to-open variation in the price, namely when the price at the close of market hours differs significantly from the opening price, an effect I have mentioned in my two previous articles on intraday log-return data. I will show (this time in R) how MDFA can take advantage of this price variation and profit from each one by 'predicting', with the extracted signal, the jump or drop in the price at the open of the next trading day. Sounds too good to be true, right? I demonstrate in this article how it is possible.
The first step, after looking at the log price and the log-return data of the asset being traded, is to construct the periodogram of the in-sample data being traded. In this example, I work with the same time frame as in my previous R tutorial, taking the in-sample portion of my data from 1-4-2018 to 1-23-2018, with my out-of-sample data interval running from 1-23-2018 to 2-1-2018, which will be used to analyze the true performance of the trading signal. The STXE data and the explanatory series of the EURO STOXX50 are first loaded into R, and the periodogram is then computed as follows.
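The loading and periodogram code is not reproduced here, but a minimal stand-in using base R's spec.pgram is sketched below; stxe_insamp is a placeholder name for the in-sample 15-minute log-returns, and rescaling the frequency axis to radians (so it can be compared with the band-pass limits quoted below) is my assumption.

# Raw periodogram of the in-sample log-returns (placeholder object stxe_insamp)
per <- spec.pgram(as.numeric(stxe_insamp), taper = 0, detrend = FALSE, plot = FALSE)

plot(2 * pi * per$freq, per$spec, type = "l",
     xlab = "frequency (radians)", ylab = "periodogram",
     main = "STXE 15-minute log-returns, in-sample")
abline(v = c(0.23, 0.32), lty = 2, col = "blue")  # band-pass limits discussed in the text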
You will notice in the periodogram of the in-sample STXE log-returns that I have identified a spectral peak between two dashed blue lines. This peak corresponds to an intrinsically important cycle in 15-minute log-returns of index futures that gives access to predicting the close-to-open variation in the price. As you will see, the cycle flows smoothly through the 26 15-minute intervals of each trading day and crosses zero at (usually) one or two points during each trading day, indicating whether to go long or short the index for the next day. I deduced this optimal frequency range in a prior analysis of this data using my target-filter toolkit in iMetrica (see the previous article). This frequency range will depend on the frequency of the intraday observations, and may also depend on the index (but in my experiments this range is generally consistent, between 0.23 and 0.32, for most index futures using 15-minute observations). Thus, in the R code above, I set a frequency cutoff at .32 and upper and lower band-pass points at .32 and .23, respectively.
Figure 2: Periodogram of the STXE log-return data. The spectral peak is extracted and highlighted between the two red dashed lines.
In this first part of the tutorial, I extract this cycle responsible for marking the close-to-open variations and show how well it can perform. As I mentioned in my previous articles on trading-signal extraction, I like to begin with the mean-square solution (that is, no customization or regularization) to the extraction problem, to see exactly what kind of parameterization I might need. To produce the vanilla mean-square solution, I set all the parameters to 0.0 and then compute the filter by calling the main MDFA function (shown below). The IMDFA function returns an object with the filter coefficients and the in-sample signal. It also plots the concurrent transfer functions for both filters, along with the filter coefficients as the lag increases, shown in Figure 3.
Figure 3: Concurrent transfer functions for the STXE series (red) and the explanatory series (cyan) (top). Coefficients for the STXE and explanatory series (bottom).
Notice the noise leakage past the stopband in the concurrent filter and the roughness of both sets of filter coefficients (due to overfitting). We would like to smooth both, and also have the filter coefficients decay as the lag increases. This ensures more consistent in-sample and out-of-sample properties of the filter. First, I apply some smoothing to the stopband with an expweight parameter of 16 and, to compensate slightly for this increased smoothness, I improve the timeliness by setting the lambda parameter to 1. After noting the improvement in the smoothness of the filter coefficients, I then proceed with the regularization and conclude with the following parameters.
Figure 4: Transfer functions and coefficients after smoothing and regularization.
A vast improvement over the mean-square solution. Virtually no noise leaks past the stopband, and the coefficients decay beautifully, with perfect smoothness achieved. Notice the two transfer functions perfectly picking out the spectral peak intrinsic to the close-to-open cycle that, as I mentioned, lies between 0.23 and 0.32. To verify that these filter coefficients achieve the extraction of the close-to-open cycle, I compute the trading signal from the imdfa object and then plot it against the log-returns of STXE. I then compute the in-sample trades using the signal and the log price of STXE. The R code is below and the plots are shown in Figures 5 and 6.
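The MDFA-specific calls are not reproduced here, but the general idea of turning a signal into long/short positions and a cumulative return can be sketched as follows, with signal and stxe_logprice as placeholder numeric vectors of equal length.

position <- sign(signal)                 # +1 long, -1 short, 0 flat
ret      <- diff(stxe_logprice)          # 15-minute log-returns of STXE

# Act on the signal from the previous bar and cumulate the resulting log-returns
pnl <- cumsum(head(position, -1) * ret)
plot(pnl, type = "l", main = "In-sample cumulative log-return of the signal")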
Figure 5: The in-sample signal and the log-returns of STXE in 15-minute observations, from 1-9-2018 to 1-23-2018.
Figure 5 shows the log-return data and the trading signal extracted from the data. The peaks in the log-return data represent the close-to-open jumps in the STOXX Europe 50 index futures contract, occurring every 27 observations. Notice how regular the signal is, and how consistently this frequency range is found in the log-return data, almost like a perfect sinusoidal wave, with one full cycle occurring nearly every 27 observations. This signal triggers the trades shown in Figure 6, where the black dotted lines are buys/longs and the blue dotted lines are sells/shorts. The signal is extremely consistent in finding the opportune moments to buy and sell at the nearly optimal peaks, as at observations 140, 197, and 240. It also 'predicts' the jump or drop of the EuroStoxx50 index futures for the next trading day by triggering the necessary buy/sell signal, as at observations 19, 40, 51, 99, 121, 156, and 250. The in-sample performance of this trading is shown in Figure 7.
Figure 6: The in-sample trades. The black dotted lines are buys/longs and the blue dotted lines are sells/shorts.
Figure 7: The in-sample performance of the trading signal.
Now, for the real litmus test of this extracted signal's performance, we need to apply the filter out-of-sample to verify its consistency, not only in performance but also in trading characteristics. To do this in R, we bind the in-sample and out-of-sample data together and then apply the filter to the out-of-sample set (requiring the final L-1 observations of the in-sample portion). The resulting signal is shown in Figure 8.
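As a rough illustration of what applying the filter means here (the package call itself is not shown), a one-sided filter of length L is just a moving dot product of the coefficients with the most recent L observations; b and x below are placeholder names, and the newest-first ordering of the coefficients is a convention of this sketch.

# b: filter coefficients (length L); x: the bound in-sample plus out-of-sample series
apply_filter <- function(x, b) {
  L   <- length(b)
  sig <- rep(NA_real_, length(x))
  for (t in L:length(x)) {
    sig[t] <- sum(b * x[t:(t - L + 1)])  # b[1] weights the newest observation
  }
  sig
}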
The signal and the log-return data: notice that the signal performs consistently out-of-sample until right around observation 170, when the log-returns become increasingly volatile. The intrinsic cycle between frequencies .23 and .32 has been slowed by this increased volatility, which can affect trading performance.
Figure 8: Signal produced out-of-sample on 210 observations, together with the STXE log-return data.
The total in-sample plus out-of-sample trading performance is shown in Figures 9 and 10, with the final 210 points being out-of-sample. The out-of-sample performance is much like the in-sample performance we had, with very clear systematic trading driven by 'predicting' the next day's close-to-open jump or drop, consistently triggering the necessary buy/sell signal, as at observations 310, 363, 383, and 413, with only one loss up until the final trading day. The higher volatility during the last day of the sample period hampers the cyclical signal, and it fails to trade as systematically as it did during the first 420 observations.
Figure 9: The total in-sample plus out-of-sample buys and sells.
Figure 10: Total performance over the in-sample and out-of-sample periods.
With this kind of performance, both in-sample and out-of-sample, and the rather consistent and methodical trading patterns this signal provides, trying to improve it would seem a futile task. Why try to fix what's not 'broken'? But, being the perfectionist that I am, I strive for a 'better' filter. If only there were a way to 1) keep the consistent cyclical trading effects as before, 2) 'predict' the next day's close-to-open jump/drop of the Euro Stoxx50 futures as before, and 3) avoid volatile periods, to eliminate the erroneous trading where the signal performed worst. After hours spent in iMetrica, I figured out how to do it. This is where advanced trading-signal engineering comes into play.
The first step was to include all the lower frequencies below .23, which were not included in my previous trading signal. Given the low amount of activity at these lower frequencies, this should only provide the effect of a 'lift' or a 'push' of the local signal, while still keeping the cyclical component. So, after switching to a low-pass filter with the cutoff adjusted, I computed the filter with the new low-pass design. The transfer functions for the filter coefficients are shown below in Figure 11, with the red plot being the transfer function for STXE. Notice that the transfer function for the explanatory series still privileges the spectral peak between 0.23 and 0.32, with only a slight lift at frequency zero (compare it with the band-pass design in Figure 4; not much has changed). The problem is that the peak exceeds 1.0 in the passband, and this will amplify the cyclical component extracted from the log-returns. That might be fine, trading-wise, but it is not what I am looking to do. For the STXE filter, we get a bit more lift at frequency zero; however, this was offset by a decrease in the extraction of the cycle between frequencies .23 and .32. In addition, a small amount of noise has entered the stopband, another factor we must mollify.
Figure 11: The concurrent transfer functions after switching to a low-pass filter.
To improve the concurrent filter properties of both, I increase the smoothing parameter expweight to 26, which affects lambda_smooth, so I decrease it to .70. This gives me a much better pair of transfer functions, shown in Figure 12. Notice that the peak in the explanatory-series transfer function is now much closer to 1.0, exactly what we want.
Figure 12: The concurrent transfer functions after switching to the low-pass filter, increasing expweight to 26, and decreasing lambda_smooth to .70.
I am still not satisfied with the lift at frequency zero for the STXE series. At roughly .5 at frequency zero, the filter may not provide the push or pull that I need. The only way to ensure a guaranteed lift in the STXE log-return series is to impose constraints on the filter coefficients so that the transfer function equals one at frequency zero. This can be achieved by setting i1 to true in the IMDFA function call, which effectively ensures that the sum of the filter coefficients is one. After doing this, I get the following transfer functions and respective filter coefficients.
Figure 13: Transfer functions and filter coefficients after setting the i1 coefficient constraint to true.
Now this is exactly what I was looking for. Not only does the transfer function for the explanatory series keep the important close-to-open cycle intact, but I have also applied the lift I need for the STXE series. The coefficients still remain smooth, with a nice decaying property at the end. I applied the new filter coefficients to the data both in-sample and out-of-sample, obtaining the trading signal shown in Figure 14. It has exactly the properties I was looking for. The close-to-open cyclical component is still being extracted (thanks, in part, to the explanatory series), and it is still relatively consistent, although not as much as with the pure band-pass design. The feature I like is the following: when the log-return data diverges from the cyclical component amid increasing volatility, the STXE filter reacts by damping the signal to avoid any erroneous trading. This can be seen at observations 100 through 120 and then from observation 390 through the end of trading. Figure 15 (the same as Figure 1 at the top of the article) shows the trades and the performance produced in-sample and out-of-sample by this signal. This, folks, is the art of meticulous trading-signal engineering.
Figure 14: In-sample and out-of-sample signal produced by the low-pass design with the i1 coefficient constraint.
With only two losses suffered out-of-sample during the roughly 9 trading days, the filter trades much more methodically than before. Notice that during the final two trading days, when volatility picked up, the signal stops trading, as it is being damped. It even continues to 'predict' the close-to-open jump/drop correctly, as at observations 288, 321, and 391. The last trade made was a sell/short position, with the signal trending downward at the end. The filter is thus positioned to make a sizable gain from this timely signaling of a short position at 391, correctly calling a large drop on the next trading day, and then sitting out the volatile trading. The gain should be large no matter what happens.
Figure 15: In-sample and out-of-sample performance of the i1-constrained filter design.
One thing I should mention before concluding is that I made one small adjustment to the filter design after imposing the i1 constraint in order to obtain the results shown in Figures 13-15. I will leave it as an exercise for the reader to deduce what I did. Hint: look at the freezed degrees of freedom before and after applying the i1 constraint. If you still have trouble finding what I did, email me and I will give more hints.
Conclusion.
The overall performance of the first filter constructed, in terms of total return on investment out-of-sample, was superior to that of the second. However, this superior performance comes only with the assumption that the cycle component defined between frequencies .23 and .32 will continue to be present in future observations of STXE up to expiration. If volatility increases and this intrinsic cycle ceases to exist in the log-return data, performance will deteriorate.
For a better and more comfortable approach that copes with changing, volatile index conditions, I would opt to ensure that the local bias is present in the signal. This will effectively push or pull the signal down or up when the intrinsic cycle is weak amid increasing volatility, resulting in a pullback in trading activity.
As before, you can obtain the high-frequency data used in this tutorial by requesting it by email.
High-frequency financial trading on FOREX with MDFA and R: an example with the Japanese yen.
Figure 1: In-sample (observations 1-250) and out-of-sample performance of the trading signal built in this tutorial using MDFA. (Top) The log price of the Yen (FXY) at 15-minute intervals and the trades generated by the trading signal. Here a black line is a buy (long) and blue is a sell (short position). (Bottom) The cumulative returns (cash account) generated by the trading, in percent gained or lost.
In my previous article on high-frequency trading in iMetrica on FOREX/GLOBEX, I introduced some robust signal-extraction strategies in iMetrica using the multivariate direct filter approach (MDFA) to generate high-performance signals for trading on the foreign exchange and futures markets. In this article I take a brief leave of absence from my world of developing financial trading signals in iMetrica and migrate to an uber-popular language used in finance thanks to its exuberant array of packages, fast data management and graphics handling, and, of course, the fact that it is free (as in speech and beer) on nearly any computing platform in the world.
This article gives an introductory tutorial on using R for high-frequency trading in the FOREX market using the R package for MDFA (offered by Herr Doktor Marc Wildi von Bern) and some strategies that I have developed for generating financially robust trading signals. For this tutorial, I consider the second example given in my previous article, where I built a trading signal for 15-minute log-returns of the Japanese yen (from opening bell to market close EST). This presented slightly new challenges compared with before, as the close-to-open jump variations are much larger than those generated by hourly or daily returns. But, as I show, these larger variations in the close-to-open price posed no problems for MDFA. In fact, it exploited these jumps and earned large profits by predicting the direction of the jump. Figure 1 at the top of this article shows the in-sample (observations 1-250) and out-of-sample (observations 251 onward) performance of the filter I will build in the first part of this tutorial.
Throughout this tutorial, I attempt to replicate the results I built in iMetrica and expand on them a bit using the R language and the MDFA implementation available here. The data we consider are 15-minute log-returns of the Yen from January 4 to January 17, which I save as an .RData file given by ld_fxy_insamp. I have an additional explanatory series embedded in the .RData file that I am using to predict the price of the Yen. In addition, I will also use price_fxy_insamp, which is the log price of the Yen, used to compute the performance (buys/sells) of the trading signal. ld_fxy_insamp will be used as the in-sample data to build the filter and the trading signal for FXY. To obtain this data so you can run these examples at home, email me and I will send you all the necessary .RData files (the in-sample and out-of-sample data) in a .zip file. Taking a quick look at the ld_fxy_insamp data, we see the log-returns of the Yen every 15 minutes, starting at the market open (UTC). The target data (Yen) is in the first column, along with the two explanatory series (the Yen and another asset co-integrated with the Yen's movement).
2018-01-04 13:30:00 0.000000e+00 0.000000e+00 0.0000000000
2018-01-04 13:45:00 4.763412e-03 4.763412e-03 0.0033465833
2018-01-04 14:00:00 -8.966599e-05 -8.966599e-05 0.0040635638
2018-01-04 14:15:00 2.597055e-03 2.597055e-03 -0.0008322064
2018-01-04 14:30:00 -7.157556e-04 -7.157556e-04 0.0020792190
2018-01-04 14:45:00 -4.476075e-04 -4.476075e-04 -0.0014685198
To get started building the first trading signal for the Yen, we begin by loading the data into our R environment, defining some initial parameters for the MDFA function call, and then computing the DFTs and the periodogram for the Yen.
As I mentioned in my previous articles, my step-by-step strategy for building trading signals always begins with a quick analysis of the periodogram of the asset to be traded. Holding the key to insights into the characteristics of how the asset trades, the periodogram is an essential tool for navigating how the extractor is chosen. Here I look for the principal spectral peaks that correspond, in the time domain, to how and where my signal will trigger buy/sell trades. Figure 2 shows the periodogram of the 15-minute log-returns of the Japanese yen over the in-sample period from January 4 to January 17, 2018. The arrows point to the main spectral peaks I look for and provide a guide to how I will define my extractor. The black dotted lines indicate the two frequency cutoffs that I will consider in this example, the first being at and the second at. Notice that both cutoff points are set directly after a spectral peak, something I highly recommend. In high-frequency trading on FOREX using MDFA, as we will see, the trick is to seek out the spectral peak that explains the close-to-open variation in the price of the foreign currency. We want to take advantage of this spectral peak, as this is where the large gains in foreign-currency trading using MDFA will occur.
Figure 2: Periodogram of FXY (Japanese yen), along with the spectral peaks and two different frequency cutoffs.
In our first example we consider the larger frequency as the cutoff for the extractor (the right-most line in the periodogram figure). I then initially set the timeliness and smoothness parameters, lambda and expweight, to 0, along with setting all the regularization parameters to 0 as well. This gives me a barometer for where and how much to adjust the filter parameters. In selecting the filter length, my empirical studies over countless experiments in building trading signals using iMetrica have shown that a 'good' choice is anywhere between 1/4 and 1/5 of the total in-sample length of the time-series data. Of course, the length depends on the frequency of the data observations (i.e. 15-minute, hourly, daily, etc.), but in general you will most likely never need a filter length greater than 1/4 of the in-sample size. Otherwise, regularization can become too cumbersome to handle effectively. In this example, the total in-sample length is 335, and so I set a filter length accordingly, which I keep for the remainder of this tutorial. In any case, the filter length is not the most important parameter to consider when building good trading signals. For a good robust selection of the filter parameter pairs with appropriate explanatory series, the trading-signal results for, say, slightly different filter lengths should hardly differ. If they do, the parameterization is not robust enough.
After loading the in-sample log-return data along with the corresponding log price of the Yen for computing the trading performance, we proceed in R to set the initial filter settings for the MDFA routine and then compute the filter using the IMDFA_comp function. This returns the i_mdfa object holding the coefficients, frequency response functions, and filter statistics, along with the signal produced for each explanatory series. We combine these signals to obtain the final in-sample trading signal. All of this is done in R as follows:
The resulting frequency response functions of the filter and the coefficients are plotted in the figure below.
Figure 3: The frequency response functions of the filter (top) and the filter coefficients (bottom).
Notice the abundance of noise still present past the cutoff frequency. This is mollified by increasing the expweight smoothness parameter. The coefficients for each explanatory series show some correlation in their movement as the lag increases. However, the smoothness and decay of the coefficients leave much to be desired. We will remedy this by introducing regularization parameters. Plots of the in-sample trading signal and the in-sample performance of the signal are shown in the two figures below. Notice that the trading signal behaves very nicely in-sample. However, looks can deceive. This stellar performance is due in large part to a filtering phenomenon called overfitting. One can deduce that overfitting is the culprit here simply by looking at the roughness of the coefficients along with the number of freezed degrees of freedom, which in this example is approximately 174 (out of 174), way too high. We would like to get this number down to around half the total number of degrees of freedom (number of explanatory series times L).
Figure 4: The trading signal and the log-return data of the Yen.
The in-sample performance of this filter demonstrates the kind of results we would like to see after regularization is applied. But now come the sobering effects of overfitting. We apply these filter coefficients to 200 15-minute observations of the Yen and the explanatory series from January 18 to February 1, 2018, and compare the characteristics with those in-sample. To do this in R, we first load the out-of-sample data into the R environment and then apply the filter to the out-of-sample data, which I have defined as x_out.
The plot in Figure 5 shows the out-of-sample trading signal. Notice that the signal is not nearly as smooth as it was in-sample. Overshooting of the data in some areas is also clearly present. Although the out-of-sample overfitting characteristics of the signal are not horribly suspicious, I would not trust this filter to produce stellar returns in the long run.
Figure 5: Filter applied to 200 out-of-sample 15-minute observations of the Yen to produce the trading signal (shown in blue).
Following the previous analysis of the mean-square solution (no customization or regularization), we now proceed to clean up the overfitting problem that was apparent in the coefficients, as well as mollify the noise in the stopband (the frequencies beyond the cutoff). To choose the smoothing and regularization parameters, one approach is to apply the smoothness parameter first, since this will generally smooth the coefficients while acting as a 'pre'-regularizer, and then move on to selecting appropriate regularization controls. Looking at the coefficients (Figure 3), we can see that a fair amount of smoothing is needed, with only a slight touch of decay. To select these two parameters in R, one option is to use the Troikaner optimizer (found here) to find a suitable combination (I have a secret-sauce algorithmic approach that I developed for iMetrica for choosing optimal combinations of parameters given an extractor and a performance indicator, although it’s lengthy (even in GNU C) and cumbersome to use, so I typically prefer the strategy discussed in this tutorial). In this example, I began by setting lambda_smooth to .5 and the decay to (.1,.1), along with an expweight smoothness parameter set to 8.5. After viewing the coefficients, it still wasn’t enough smoothness, so I proceeded to add more, finally reaching .63, which did the trick. I then chose lambda to balance the effects of the smoothing expweight (lambda is always the last-resort tweaking parameter).
Figure 6 shows the resulting frequency response function for both explanatory series (Yen in red). Notice that the largest spectral peak found directly before the frequency cutoff at is being emphasized and slightly mollified (value near .8 instead of 1.0). The other spectral peaks below are also present. For the coefficients, just enough smoothing and decay was applied to keep the lag, cyclical, and correlated structure of the coefficients intact, but now they look much nicer in their smoothed form. The number of freezed degrees of freedom has been reduced to approximately 102.
Figure 6: The frequency response functions and the coefficients after regularization and smoothing have been applied (top). The smoothed coefficients with slight decay at the end (bottom). Number of freezed degrees of freedom is approximately 102 (out of 172).
Along with an improved freezed degrees of freedom and no apparent havoc of overfitting, we apply this filter out-of-sample to the 200 out-of-sample observations in order to verify the improvement in the structure of the filter coefficients (shown below in Figure 7). Notice the tremendous improvement in the properties of the trading signal (compared with Figure 5). The overshooting of the data has been eliminated and the overall smoothness of the signal has significantly improved. This is due to the fact that we’ve eradicated the presence of overfitting.
Figure 7: Out-of-sample trading signal with regularization.
With all indications of a filter endowed with exactly the characteristics we need for robustness, we now apply the trading signal both in-sample and out of sample to activate the buy/sell trades and see the performance of the trading account in cash value. When the signal crosses below zero, we sell (enter short position) and when the signal rises above zero, we buy (enter long position).
The top plot of Figure 8 is the log price of the Yen for the 15 minute intervals and the dotted lines represent exactly where the trading signal generated trades (crossing zero). The black dotted lines represent a buy (long position) and the blue lines indicate a sell (and short position). Notice that the signal predicted all the close-to-open jumps for the Yen (in part thanks to the explanatory series). This is exactly what we will be striving for when we add regularization and customization to the filter. The cash account of the trades over the in-sample period is shown below, where transaction costs were set at .05 percent. In-sample, the signal earned roughly 6 percent in 9 trading days and a 76 percent trading success ratio.
Figure 8: In-sample performance of the new filter and the trades that are generated.
Now for the ultimate test to see how well the filter performs in producing a winning trading signal, we applied the filter to the 200 15-minute out-of-sample observation of the Yen and the explanatory series from Jan 18th to February 1st and make trades based on the zero crossing. The results are shown below in Figure 9. The black lines represent the buys and blue lines the sells (shorts). Notice the filter is still able to predict the close-to-open jumps even out-of-sample thanks to the regularization. The filter succumbs to only three tiny losses at less than .08 percent each between observations 160 and 180 and one small loss at the beginning, with an out-of-sample trade success ratio hitting 82 percent and an ROI of just over 4 percent over the 9 day interval.
Figure 9: Out-of-sample performance of the regularized filter on 200 out-of-sample 15 minute returns of the Yen. The filter achieved 4 percent ROI over the 200 observations and an 82 percent trade success ratio.
Compare this with the results achieved in iMetrica using the same MDFA parameter settings. In Figure 10, both the in-sample and out-of-sample performance are shown. The performance is nearly identical.
Figure 10: In-sample and out-of-sample performance of the Yen filter in iMetrica. Nearly identical with performance obtained in R.
Now we take a stab at producing another trading filter for the Yen, only this time we wish to identify only the lowest frequencies to generate a trading signal that trades less often, only seeking the largest cycles. As with the performance of the previous filter, we still wish to target the frequencies that might be responsible to the large close-to-open variations in the price of Yen. To do this, we select our cutoff to be which will effectively keep the largest three spectral peaks intact in the low-pass band of .
For this new filter, we keep things simple by continuing to use the same regularization parameters chosen in the previous filter as they seemed to produce good results out-of-sample. The lambda and expweight customization parameters, however, need to be adjusted to account for the new noise suppression requirements in the stopband and the phase properties in the smaller passband. Thus I increase the smoothing parameter and decrease the timeliness parameter (which only affects the passband) to account for this change. The new frequency response functions and filter coefficients for this smaller lowpass design are shown below in Figure 11. Notice that the second spectral peak is accounted for and only slightly mollified under the new changes. The coefficients still have the noticeable smoothness and decay at the largest lags.
Figure 11: Frequency response functions of the two filters and their corresponding coefficients.
To test the effectiveness of this new lower trading frequency design, we apply the filter coefficients to the 200 out-of-sample observations of the 15-minute Yen log-returns. The performance is shown below in Figure 12. In this filter, we clearly see that the filter still succeeds in predicting correctly the large close-to-open jumps in the price of the Yen. Only three total losses are observed during the 9 day period. The overall performance is not as appealing as the previous filter design as less amount of trades are made, with a near 2 percent ROI and 76 percent trade success ratio. However, this design could fit the priorities for a trader much more sensitive to transaction costs.
Figure 12: Out-of-sample performance of filter with lower cutoff.
Conclusion.
Verification and cross-validation is important, just as the most interesting man in the world will tell you.
The point of this tutorial was to show some of the main concepts and strategies that I undergo when approaching the problem of building a robust and highly efficient trading signal for any given asset at any frequency. I also wanted to see if I could achieve similar results with the R MDFA package as my iMetrica software package. The results ended up being nearly parallel except for some minor differences. The main points I was attempting to highlight were in first analyzing the periodogram to seek out the important spectral peaks (such as ones associated with close-to-open variations) and to demonstrate how the choice of the cutoff affects the systematic trading. Here’s a quick recap on good strategies and hacks to keep in mind.
Summary of strategies for building trading signal using MDFA in R:
As I mentioned before, the periodogram is your best friend. Apply the cutoff directly after any range of spectral peaks that you want to consider. These peaks are what generate the trades. Utilize a choice of filter length no greater than 1/4. Anything larger is unnecessary. Begin by computing the filter in the mean-square sense, namely without using any customization or regularization, and see exactly what needs to be improved upon by viewing the frequency response functions and coefficients for each explanatory series. Good performance of the trading signal in-sample (and even out-of-sample in most cases) is meaningless unless the coefficients have solid robust characteristics in both the frequency domain and the lag domain. I recommend beginning with tweaking the smoothness customization parameter expweight and the lambda_smooth regularization parameters first. Then proceed with only slight adjustments to the lambda_decay parameters. Finally, as a last resort, the lambda customization. I really never bother to look at lambda_cross. It has seldom helped in any significant manner. Since the data we are using to target and build trading signals are log-returns, no need to ever bother with i1 and i2. Those are for the truly advanced and patient signal extractors, and should only be left for those endowed with iMetrica 😉
If you have any questions, or would like the high-frequency Yen data I used in these examples, feel free to contact me and I’ll send them to you. Until next time, happy extracting!

QuantStrat TradeR.
Trading, QuantStrat, R, and more.
Replicating Volatility ETN Returns From CBOE Futures.
This post will demonstrate how to replicate the volatility ETNs (XIV, VXX, ZIV, VXZ) from CBOE futures, thereby allowing any individual to create synthetic ETF returns from before their inception, free of cost.
So, before I get to the actual algorithm, it depends on an update to the term structure algorithm I shared some months back.
In that algorithm, mistakenly (or for the purpose of simplicity), I used calendar days as the time to expiry, when it should have been business days, which also accounts for weekends, and holidays, which are an irritating artifact to keep track of.
So here’s the salient change, in the loop that calculates times to expiry:
The one salient line in particular, is this:
What is this bizdays function? It comes from the bizdays package in R.
There’s also the tradingHolidays.R script, which makes further use of the bizdays package. Here’s what goes on under the hood in tradingHolidays.R, for those that wish to replicate the code:
There are two CSVs that I manually compiled, but will share screenshots of–they are the easter holidays (because they have to be adjusted for turning Sunday to Friday because of Easter Fridays), and the rest of the national holidays.
Here is what the easters csv looks like:
And the nonEasterHolidays, which contains New Year’s Day, MLK Jr. Day, President’s Day, Memorial Day, Independence Day, Labor Day, Thanksgiving Day, and Christmas Day (along with their observed dates) CSV:
Furthermore, we need to adjust for the two days that equities were not trading due to Hurricane Sandy.
So then, the list of holidays looks like this:
So once we have a list of holidays, we use the bizdays package to set the holidays and weekends (Saturday and Sunday) as our non-business days, and use that function to calculate the correct times to expiry.
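For reference, the calendar setup with the bizdays package looks roughly like the following; the calendar name and the settleDates/expiryDates/holidays objects are placeholders standing in for the vectors built from the CSVs described above.

library(bizdays)

# Weekends plus the compiled holiday list are the non-business days
create.calendar("cboeCalendar", holidays = holidays, weekdays = c("saturday", "sunday"))

# Business days from each settle date to the corresponding expiry date
timesToExpiry <- bizdays(settleDates, expiryDates, "cboeCalendar")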
So, now that we have the updated expiry structure, we can write a function that will correctly replicate the four main volatility ETNs–XIV, VXX, ZIV, and VXZ.
Here’s the English explanation:
VXX is made up of two contracts–the front month, and the back month, and has a certain number of trading days (AKA business days) that it trades until expiry, say, 17. During that timeframe, the front month (let’s call it M1) goes from being the entire allocation of funds, to being none of the allocation of funds, as the front month contract approaches expiry. That is, as a contract approaches expiry, the second contract gradually receives more and more weight, until, at expiry of the front month contract, the second month contract contains all of the funds–just as it *becomes* the front month contract. So, say you have 17 days to expiry on the front month. At the expiry of the previous contract, the second month will have a weight of 17/17–100%, as it becomes the front month. Then, the next day, that contract, now the front month, will have a weight of 16/17 at settle, then 15/17, and so on. That numerator is called dr, and the denominator is called dt.
However, beyond this, there’s a second mechanism that’s responsible for the VXX looking like it does as compared to a basic futures contract (that is, the decay responsible for short volatility’s profits), and that is the “instantaneous” rebalancing. That is, the returns for a given day are today’s settles multiplied by yesterday’s weights, over yesterday’s settles multiplied by yesterday’s weights, minus one. That is, (S1_t * (dr/dt)_t-1 + S2_t * (1 - dr/dt)_t-1) / (S1_t-1 * (dr/dt)_t-1 + S2_t-1 * (1 - dr/dt)_t-1) – 1 (I could use a tutorial on LaTeX). So, when you move forward a day, well, tomorrow, today’s weights become t-1. Yet, when were the assets able to be rebalanced? Well, in the ETNs such as VXX and VXZ, the “hand-waving” is that it happens instantaneously. That is, the weight for the front month was 93%, the return was realized at settlement (that is, from settle to settle), and immediately after that return was realized, the front month’s weight shifts from 93%, to, say, 88%. So, say Credit Suisse (that issues these ETNs), has $10,000 (just to keep the arithmetic and number of zeroes tolerable, obviously there are a lot more in reality) worth of XIV outstanding after immediately realizing returns, it will sell $500 of its $9300 in the front month, and immediately move them to the second month, so it will immediately go from $9300 in M1 and $700 in M2 to $8800 in M1 and $1200 in M2. When did those $500 move? Immediately, instantaneously, and if you like, you can apply Clarke’s Third Law and call it “magically”.
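As a toy illustration of the arithmetic just described (not the full replication function), one day's ETN return from two settle series and yesterday's dr/dt weight could be computed like this; the settle values in the example are made up.

# S1/S2: front- and second-month settles today (t) and yesterday (tm1);
# dr_tm1: roll days remaining on the front month as of yesterday; dt: total roll days
vxxDailyReturn <- function(S1_t, S2_t, S1_tm1, S2_tm1, dr_tm1, dt) {
  w1 <- dr_tm1 / dt          # yesterday's front-month weight
  w2 <- 1 - w1               # yesterday's second-month weight
  (S1_t * w1 + S2_t * w2) / (S1_tm1 * w1 + S2_tm1 * w2) - 1
}

# Example with 16 of 17 roll days remaining as of yesterday
vxxDailyReturn(S1_t = 14.2, S2_t = 15.1, S1_tm1 = 14.0, S2_tm1 = 15.0,
               dr_tm1 = 16, dt = 17)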
The only exception is the day after roll day, in which the second month simply becomes the front month as the previous front month expires, so what was a 100% weight on the second month will now be a 100% weight on the front month, so there’s some extra code that needs to be written to make that distinction.
That’s the way it works for VXX and XIV. What’s the difference for VXZ and ZIV? It’s really simple–instead of M1 and M2, VXZ uses the exact same weightings (that is, the time remaining on front month vs. how many days exist for that contract to be the front month), uses M4, M5, M6, and M7, with M4 taking dr/dt, M5 and M6 always being 1, and M7 being 1-dr/dt.
In any case, here’s the code.
So, a big thank you goes out to Michael Kapler of Systematic Investor Toolbox for originally doing the replication and providing his code. My code essentially does the same thing, in, hopefully a more commented way.
So, ultimately, does it work? Well, using my updated term structure code, I can test that.
While I’m not going to paste my entire term structure code (again, available here, just update the script with my updates from this post), here’s how you’d run the new function:
And since it returns both the vxx returns and the vxz returns, we can compare them both.
With the result:
Basically, a perfect match.
Let’s do the same thing, with ZIV.
So, rebuilding from the futures does a tiny bit better than the ETN. But the trajectory is largely identical.
That concludes this post. I hope it has shed some light on how these volatility ETNs work, and how to obtain them directly from the futures data published by the CBOE, which are the inputs to my term structure algorithm.
This also means that for institutions interested in trading my strategy, that they can obtain leverage to trade the futures-composite replicated variants of these ETNs, at greater volume.
Thanks for reading.
NOTES: For those interested in a retail subscription strategy to trading volatility, do not hesitate to subscribe to my volatility-trading strategy. For those interested in employing me full-time or for long-term consulting projects, I can be reached on my LinkedIn, or my email: ilya.kipnis@gmail.
(Don’t Get) Contangled Up In Noise.
This post will be about investigating the efficacy of contango as a volatility trading signal.
For those that trade volatility (like me), a term you may see that’s somewhat ubiquitous is the term “contango”. What does this term mean?
Well, simple: it just means the ratio of the second month of VIX futures over the first. The idea being is that when the second month of futures is more than the first, that people’s outlook for volatility is greater in the future than it is for the present, and therefore, the futures are “in contango”, which is most of the time.
Furthermore, those that try to find decent volatility trading ideas may have often seen that futures in contango implies that holding a short volatility position will be profitable.
Is this the case?
Well, there’s an easy way to answer that.
First off, refer back to my post on obtaining free futures data from the CBOE.
Using this data, we can obtain our signal (that is, in order to run the code in this post, run the code in that post).
Now, let’s get our XIV data (again, big thanks to Mr. Helmuth Vollmeier for so kindly providing it).
Now, here’s how this works: as the CBOE doesn’t update its settles until around 9:45 AM EST on the day after (EG a Tuesday’s settle data won’t release until Wednesday at 9:45 AM EST), we have to enter at close of the day after the signal fires. (For those wondering, my subscription strategy uses this mechanism, giving subscribers ample time to execute throughout the day.)
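A hedged sketch of the signal and entry timing as described; termStructure (with front- and second-month settle columns) and xivRets are placeholder names for objects built in the posts referenced above, and the two-day lag is one way to encode entering at the close of the day after the signal fires (the exact alignment depends on how your settle dates are indexed).

# Contango: second-month settle over front-month settle, minus one
contango <- termStructure[, 2] / termStructure[, 1] - 1

# Signal observed on day t, settles released the next morning, traded at the
# following close, hence a lag of 2 against daily XIV returns
sig       <- lag(contango > 0, 2)
stratRets <- sig * xivRets
stratRets[is.na(stratRets)] <- 0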
So, let’s calculate our backtest returns. Here’s a stratStats function to compute some summary statistics.
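One plausible implementation of the stratStats helper, built on PerformanceAnalytics (the exact metrics in the author's version may differ), is sketched here and applied to the placeholder returns from the sketch above.

library(PerformanceAnalytics)

stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5, ] <- stats[1, ] / stats[4, ]            # Calmar ratio
  stats[6, ] <- stats[1, ] / UlcerIndex(rets)      # Ulcer Performance Index
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  stats
}

stratStats(stratRets)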
With the following results:
So, this is obviously a disaster. Visual inspection will show devastating, multi-year drawdowns. Using the table.Drawdowns command, we can view the worst ones.
So, the top 3 are horrendous, and then anything above 30% is still pretty awful. A couple of those drawdowns lasted multiple years as well, with a massive length to the trough. 458 trading days is nearly two years, and 364 is about one and a half years. Imagine seeing a strategy be consistently on the wrong side of the trade for nearly two years, and when all is said and done, you’ve lost three-fourths of everything in that strategy.
There’s no sugar-coating this: such a strategy can only be called utter trash.
Let’s try one modification: we’ll require both contango (C2 > C1), and that contango be above its 60-day simple moving average, similar to my VXV/VXMT strategy.
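Continuing with the placeholder names from the earlier sketch, the modified rule might look like this.

library(TTR)

smaContango <- SMA(contango, n = 60)

# Require contango to be positive and above its 60-day simple moving average
sig2       <- lag(contango > 0 & contango > smaContango, 2)
stratRets2 <- sig2 * xivRets
stratRets2[is.na(stratRets2)] <- 0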
With the results:
So, a Calmar still safely below 1, an Ulcer Performance Index still in the basement, a maximum drawdown that’s long past the point that people will have abandoned the strategy, and so on.
So, even though it was improved, it’s still safe to say this strategy doesn’t perform too well. Even after the large 2007-2008 drawdown, it still gets some things pretty badly wrong, like being exposed to all of August 2017.
While I think there are applications to contango in volatility investing, I don’t think its use is in generating the long/short volatility signal on its own. Rather, I think other indices and sources of data do a better job of that. Such as the VXV/VXMT, which has since been iterated on to form my subscription strategy.
Thanks for reading.
NOTE: I am currently seeking networking opportunities, long-term projects, and full-time positions related to my skill set. My linkedIn profile can be found here.
Comparing Some Strategies from Easy Volatility Investing, and the table.Drawdowns Command.
This post will be about comparing strategies from the paper “Easy Volatility Investing”, along with a demonstration of R’s table.Drawdowns command.
First off, before going further, while I think the execution assumptions found in EVI don’t lend the strategies well to actual live trading (although their risk/reward tradeoffs also leave a lot of room for improvement), I think these strategies are great as benchmarks.
So, some time ago, I did an out-of-sample test for one of the strategies found in EVI, which can be found here.
Using the same source of data, I also obtained data for SPY (though, again, AlphaVantage can also provide this service for free for those that don’t use Quandl).
Here’s the new code.
So, an explanation: there are four return streams here–buy and hold XIV, the DDN momentum from a previous post, and two other strategies.
The simpler one, called the VRatio, is simply the ratio of the VIX over the VXV. Near the close, check this quantity. If it is less than one, buy XIV; otherwise, buy VXX.
The other one, called the Volatility Risk Premium strategy (or VRP for short), compares the 10 day historical volatility (that is, the annualized running ten day standard deviation) of the S&P 500, subtracts it from the VIX, and takes a 5 day moving average of that. Near the close, when that’s above zero (that is, VIX is higher than historical volatility), go long XIV, otherwise, go long VXX.
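A hedged sketch of both rules as described, with placeholder series names (vix and vxv are index closes, spyRets daily SPY returns, xivRets/vxxRets the ETN returns); the exact alignment and lag conventions used in the paper and post may differ.

library(TTR)

# VRatio: VIX/VXV below one -> long XIV, otherwise long VXX
vRatio     <- vix / vxv
vRatioLong <- vRatio < 1
vRatioRets <- lag(vRatioLong) * xivRets + lag(!vRatioLong) * vxxRets

# VRP: VIX minus 10-day annualized historical SPY volatility, smoothed over 5 days
histVol <- runSD(spyRets, n = 10) * sqrt(252) * 100
vrp     <- SMA(vix - histVol, n = 5)
vrpLong <- vrp > 0
vrpRets <- lag(vrpLong) * xivRets + lag(!vrpLong) * vxxRets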
Again, all of these strategies are effectively “observe near/at the close, buy at the close”, so are useful for demonstration purposes, though not for implementation purposes on any large account without incurring market impact.
Here are the results, since 2018 (that is, around the time of XIV’s actual inception):
To note, both the momentum and the VRP strategy underperform buying and holding XIV since 2018. The VRatio strategy, on the other hand, does outperform.
Here’s a summary statistics function that compiles some top-level performance metrics.
To note, all of the benchmark strategies suffered very large drawdowns since XIV’s inception, which we can examine using the table.Drawdowns command, as seen below:
Note that the table.Drawdowns command only examines one return stream at a time. Furthermore, the top argument specifies how many drawdowns to look at, sorted by greatest drawdown first.
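For reference, a typical call looks like this (PerformanceAnalytics; the return-series name is a placeholder).

# Five worst drawdowns of a single return stream, deepest first
table.Drawdowns(xivRets, top = 5)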
One reason I think that these strategies seem to suffer the drawdowns they do is that they’re either all-in on one asset, or its exact opposite, with no room for error.
One last thing, for the curious, here is the comparison with my strategy since 2018 (essentially XIV inception) benchmarked against the strategies in EVI (which I have been trading with live capital since September, and have recently opened a subscription service for):
Thanks for reading.
NOTE: I am currently looking for networking and full-time opportunities related to my skill set. My LinkedIn profile can be found here.
Launching My Subscription Service.
After gauging interest from my readers, I’ve decided to open up a subscription service. I’ll copy and paste the FAQs, or my best attempt at trying to answer as many questions as possible ahead of time, and may answer more in the future.
I’m choosing to use Patreon just to outsource all of the technicalities of handling subscriptions and creating a centralized source to post subscription-based content.
FAQs (copied from the subscription page):
Thank you for visiting. After gauging interest from my readership on my main site (quantstrattrader.wordpress), I created this as a subscription page for quantitative investment strategies, with the goal of having subscribers turn their cash into more cash, net of subscription fees (hopefully). The systems I develop come from a background of learning from experienced quantitative trading professionals, and senior researchers at large firms. The current system I initially published a prototype for several years back and watched it being tracked, before finally starting to deploy my own capital earlier this year, and making the most recent modifications even more recently.
And while past performance doesn’t guarantee future results and the past doesn’t repeat itself, it often rhymes, so let’s turn money into more money.
Some FAQs about the strategy:
​What is the subscription price for this strategy?
​Currently, after gauging interest from readers and doing research based on other sites, the tentative pricing is $50/month. As this strategy builds a track record, that may be subject to change in the future, and notifications will be made in such an event.
What is the description of the strategy?
The strategy is mainly a short volatility system that trades XIV, ZIV, and VXX. As far as volatility strategies go, it’s fairly conservative in that it uses several different checks in order to ensure a position.
What is the strategy’s edge?
In two words: risk management. Essentially, there are a few separate criteria to select an investment, and the system spends a not-insignificant time with no exposure when some of these criteria provide contradictory signals. Furthermore, the system uses disciplined methodologies in its construction in order to avoid unnecessary free parameters, and to keep the strategy as parsimonious as possible.
Do you trade your own capital with this strategy?
When was the in-sample training period for this system?
A site that no longer updates its blog (volatility made simple) once tracked a more rudimentary strategy that I wrote about several years ago. I was particularly pleased with the results of that vetting, and recently have received input to improve my system to a much greater degree, as well as gained the confidence to invest live capital into it.
How many trades per year does the system make?
In the backtest from April 20, 2008 through the end of 2018, the system made 187 transactions in XIV (both buy and sell), 160 in ZIV, and 52 in VXX. Meaning that over the course of approximately 9 years, there were on average about 43 transactions per year. In some cases, this may simply be switching from XIV to ZIV or vice versa. In other words, trades last approximately a week (some may be longer, some shorter).
When will signals be posted?
Signals will be posted sometime between 12 PM and market close (4 PM EST). In backtesting, they are tested as market on close orders, so individuals assume any risk/reward by executing earlier.
How often is this system in the market?
About 56%. However, over the course of backtesting (and live trading), only about 9% of months have zero return.
What are the distribution of winning, losing, and zero return months?
As of late October 2017, there have been about 65% winning months (with an average gain of 12.8%), 26% losing months (with an average loss of 4.9%), and 9% zero months.
What are some other statistics about the strategy?
Since 2018 (around the time that XIV officially came into inception as opposed to using synthetic data), the strategy has boasted an 82% annualized return, with a 24.8% maximum drawdown and an annualized standard deviation of 35%. This means a Sharpe ratio (return to standard deviation) higher than 2, and a Calmar ratio higher than 3. It also has an Ulcer Performance Index of 10.
What are the strategy’s worst drawdowns?
Since 2018 (again, around the time of XIV’s inception), the largest drawdown was 24.8%, starting on October 31, 2018, and making a new equity high on January 12, 2018. The longest drawdown started on August 21, 2018 and recovered on April 10, 2018, and lasted for 160 trading days.
Will the subscription price change in the future?
If the strategy continues to deliver strong returns, then there may be reason to increase the price so long as the returns bear it out.
Can a conservative risk signal be provided for those who might not be able to tolerate a 25% drawdown?
A variant of the strategy that targets about half of the annualized standard deviation of the strategy boasts a 40% annualized return for about 12% drawdown since 2018. Overall, this has slightly higher reward to risk statistics, but at the cost of cutting aggregate returns in half.
Can’t XIV have a termination event?
This refers to the idea of the XIV ETN terminating if it loses 80% of its value in a single day. To give an idea of the likelihood of this event, using synthetic data, the XIV ETN had a massive drawdown of 92% over the course of the 2008 financial crisis. For the history of that synthetic (pre-inception) and realized (post-inception) data, the absolute worst day was a down day of 26.8%. To note, the strategy was not in XIV during that day.
What was the strategy’s worst day?
On September 16, 2018, the strategy lost 16% in one day. This was at the tail end of a stretch of positive days that made about 40%.
What are the strategy’s risks?
The first risk is that, given that this strategy is naturally biased towards short volatility, it can have the potential for some sharp drawdowns due to the nature of volatility spikes. The other risk is that, given that this strategy sometimes spends its time in ZIV, it will underperform XIV on some good days. This second risk is a consequence of additional layers of risk management in the strategy.
How complex is this strategy?
Not overly. It’s only slightly more complex than a basic momentum strategy when counting free parameters, and can be explained in a couple of minutes.
Does this strategy use any complex machine learning methodologies?
No. The data requirements for such algorithms and the noise in the financial world make it very risky to apply these methodologies, and research thus far has not borne fruit that would justify incorporating them.
Will instrument volume ever be a concern (particularly ZIV)?
According to one individual who worked on the creation of the original VXX ETN (and by extension, its inverse, XIV), new shares of ETNs can be created by the issuer (in ZIV’s case, Credit Suisse) on demand. In short, the concern of volume is more of a concern of the reputability of the person making the request. In other words, it depends on how well the strategy does.
Can the strategy be held liable/accountable/responsible for a subscriber’s loss/drawdown?
Let this serve as a disclaimer: by subscribing, you agree to waive any legal claim against the strategy, or its creator(s), in the event of drawdowns, losses, etc. The subscription is for viewing the output of a program, and this service does not actively manage a penny of subscribers’ actual assets. Subscribers can choose to ignore the strategy’s signals at a moment’s notice at their discretion. The program’s output should not be thought of as investment advice coming from a CFP, CFA, RIA, etc.
Why should these signals be trusted?
Because my work on other topics has been on full, public display for several years. Unlike other websites, I have shown “bad backtests”, thus breaking the adage of “you’ll never see a bad backtest”. I have shown thoroughness in my research, and the same thoroughness has been applied towards this system as well. Until there is a longer track record such that the system can stand on its own, the trust in the system is the trust in the system’s creator.
Who is the intended audience for these signals?
The intended audience is individual, retail investors with a certain risk tolerance, and is priced accordingly.
Isn’t volatility investing very risky?
It’s risky from the perspective of the underlying instrument having the capacity to realize very large drawdowns (greater than 60%, and even greater than 90%). However, from a purely numerical standpoint, Amazon, the company taking over so much of shopping, has since inception had a 37.1% annualized rate of return, a standard deviation of 61.5%, a worst drawdown of 94%, and an Ulcer Performance Index of 0.9. By comparison, XIV, from 2008 (using synthetic data), has had a 35.5% annualized rate of return, a standard deviation of 57.7%, a worst drawdown of 92%, and an Ulcer Performance Index of 0.6. If Amazon is considered a top-notch asset, then from a quantitative comparison, a system looking to capitalize on volatility bets should be viewed from a similar perspective. To be sure, the strategy’s performance vastly outperforms that of buying and holding XIV (which nobody should do). However, the philosophy of volatility products being much riskier than household tech names just does not hold true unless the future wildly differs from the past.
Is there a possibility for collaborating with other strategy creators?
Feel free to contact me at my email ilya.kipnis@gmail to discuss that possibility. I request a daily stream of returns before starting any discussion.
Because past all the artsy-craftsy window dressing and interesting choice of vocabulary, Patreon is simply a platform that processes payments and creates a centralized platform from which to post subscription-based content, as opposed to maintaining mailing lists and other technical headaches. Essentially, it’s simply a way to outsource the technical end of running a business, even if the window dressing is a bit unorthodox.
Thanks for reading.
NOTE: I am currently interested in networking and full-time roles based on my skills. My LinkedIn profile can be found here.
The Return of Free Data and Possible Volatility Trading Subscription.
This post will be about pulling free data from AlphaVantage, and gauging interest for a volatility trading subscription service.
So first off, ever since the yahoos at Yahoo decided to turn off their free data, the world of free daily data has been in somewhat of a dark age. Well, thanks to Josh Ulrich (see blog.fosstrading/2017/10/getsymbols-and-alpha-vantage.html), Paul Teetor, and other R/Finance individuals, the latest edition of quantmod (which can be installed from CRAN) now contains a way to get free financial data from AlphaVantage since the year 2000, which is usually enough for most backtests, as that date predates the inception of most ETFs.
Here’s how to do it.
Once you do that, downloading data is simple, if not slightly slow. Here’s how to do it.
And the results:
Which means that if any one of my old posts on asset allocation has been somewhat defunct thanks to bad Yahoo data, it will now work again with a slight modification to the data input algorithms.
Beyond demonstrating this routine, one other thing I’d like to do is to gauge interest for a volatility signal subscription service, for a system I have personally started trading a couple of months ago.
Simply, I have seen other websites with subscription services with worse risk/reward than the strategy I currently trade, which switches between XIV, ZIV, and VXX. Currently, the equity curve, in log 10, looks like this:
That is, $1000 in 2008 would have become approximately $1,000,000 today, if one was able to trade this strategy since then.
Since 2018 (around the time of inception for XIV), the performance has been:
Considering that some websites out there charge upwards of $50 a month for either a single tactical asset rotation strategy (and a lot more for a combination) with inferior risk/return profiles, or a volatility strategy that may have had a massive and historically record-breaking drawdown, I was hoping to gauge a price point for what readers would consider paying for signals from a better strategy than those.
Thanks for reading.
NOTE: I am currently interested in networking and am seeking full-time opportunities related to my skill set. My LinkedIn profile can be found here.
The Kelly Criterion — Does It Work?
This post will be about implementing and investigating the running Kelly Criterion — that is, a constantly adjusted Kelly Criterion that changes as a strategy realizes returns.
For those not familiar with the Kelly Criterion, it’s the idea of adjusting a bet size to maximize a strategy’s long term growth rate. Both Wikipedia (https://en.wikipedia/wiki/Kelly_criterion) and Investopedia have entries on the Kelly Criterion. Essentially, it’s about maximizing your long-run expectation of a betting system, by sizing bets higher when the edge is higher, and vice versa.
There are two formulations for the Kelly criterion: the Wikipedia result presents it as mean over sigma squared. The Investopedia definition is P-[(1-P)/winLossRatio], where P is the probability of a winning bet, and the winLossRatio is the average win over the average loss.
In any case, here are the two implementations.
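What follows is a rough sketch rather than the original code: both variants written as rolling functions over a numeric vector of strategy returns, with an arbitrary window length. The resulting leverage series should be lagged by one period before being applied to returns, to avoid lookahead.

library(zoo)

# Wikipedia formulation: mean return over variance, on a rolling window
kellyWiki <- function(rets, window = 60) {
  rollapplyr(rets, window, function(x) mean(x) / var(x), fill = NA)
}

# Investopedia formulation: P - (1 - P) / winLossRatio, on a rolling window
kellyInvestopedia <- function(rets, window = 60) {
  rollapplyr(rets, window, function(x) {
    p  <- mean(x > 0)                              # share of winning periods
    wl <- mean(x[x > 0]) / abs(mean(x[x < 0]))     # average win over average loss
    p - (1 - p) / wl
  }, fill = NA)
}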
Let’s try this with some data. At this point in time, I’m going to show a non-replicable volatility strategy that I currently trade.
For the record, here are its statistics:
Now, let’s see what the Wikipedia version does:
The results are simply ridiculous. And here would be why: say you have a mean return of .0005 per day (5 bps/day), and a standard deviation equal to that (that is, a Sharpe ratio of 1). Mean over sigma squared gives .0005/.0005^2 = 1/.0005 = 2000. In other words, a leverage of 2000 times. This clearly makes no sense.
The other variant is the more particular Investopedia definition.
Looks a bit more reasonable. However, how does it stack up against not using it at all?
Turns out, the fabled Kelly Criterion doesn’t really change things all that much.
For the record, here are the statistical comparisons:
Thanks for reading.
NOTE: I am currently looking for my next full-time opportunity, preferably in New York City or Philadelphia relating to the skills I have demonstrated on this blog. My LinkedIn profile can be found here. If you know of such opportunities, do not hesitate to reach out to me.
Leverage Up When You’re Down?
This post will investigate the idea of reducing leverage when drawdowns are small, and increasing leverage as losses accumulate. It’s based on the idea that whatever goes up must come down, and whatever comes down generally goes back up.
I originally came across this idea from this blog post.
So, first off, let’s write an easy function that allows replication of this idea. Essentially, we have several arguments:
One: the default leverage (that is, when your drawdown is zero, what’s your exposure)? For reference, in the original post, it’s 10%.
Next: the various leverage levels. In the original post, the leverage levels are 25%, 50%, and 100%.
And lastly, we need the corresponding thresholds at which to apply those leverage levels. In the original post, those levels are 20%, 40%, and 55%.
So, now we can create a function to implement that in R. The idea being that we have R compute the drawdowns, and then use that information to determine leverage levels as precisely and frequently as possible.
Here’s a quick piece of code to do so:
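The original function is not reproduced here; a minimal sketch of the idea, using the threshold and leverage values described above (everything else is my own choice), might look like this:

library(xts)

# scale exposure by the depth of the benchmark's current drawdown
# rets: xts of benchmark returns; thresholds and leverages must have the same length
drawdownLeverage <- function(rets, defaultLev = 0.1,
                             thresholds = c(0.2, 0.4, 0.55),
                             leverages  = c(0.25, 0.5, 1)) {
  wealth <- cumprod(1 + as.numeric(rets))
  dd     <- 1 - wealth / cummax(wealth)            # running drawdown, as a positive fraction
  lev    <- rep(defaultLev, length(dd))
  for (i in seq_along(thresholds)) {
    lev[dd >= thresholds[i]] <- leverages[i]       # deeper thresholds overwrite shallower ones
  }
  lev <- xts(lev, order.by = index(rets))
  lag(lev) * rets                                  # yesterday's drawdown sizes today's exposure
}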
So, let’s replicate some results.
And our results look something like this:
That said, what would happen if one were to extend the data for all available XIV data?
A different story.
In this case, I think the takeaway is that such a mechanism does well when the drawdowns for the benchmark in question occur sharply, so that the lower exposure protects from those sharp drawdowns, and then the benchmark spends much of the time in a recovery mode, so that the increased exposure has time to earn outsized returns, and then draws down again. When the benchmark continues to see drawdowns after maximum leverage is reached, or continues to perform well when not in drawdown, such a mechanism falls behind quickly.
As always, there is no free lunch when it comes to drawdowns, as trying to lower exposure in preparation for a correction will necessarily mean forfeiting a painful amount of upside in the good times, at least as presented in the original post.
Thanks for reading.
NOTE: I am currently looking for my next full-time opportunity, preferably in New York City or Philadelphia relating to the skills I have demonstrated on this blog. My LinkedIn profile can be found here. If you know of such opportunities, do not hesitate to reach out to me.
Let’s Talk Drawdowns (And Affiliates)
This post will be directed towards those newer in investing, with an explanation of drawdowns–in my opinion, a simple and highly important risk statistic.
Would you invest in this?
As it turns out, millions of people do, and did. That is the S&P 500, from 2000 through 2018, more colloquially referred to as “the stock market”. Plenty of people around the world invest in it, and for a risk to reward payoff that is very bad, in my opinion. This is an investment that, in ten years, lost half of its value–twice!
At its simplest, an investment–placing your money in an asset like a stock, a savings account, and so on, instead of spending it, has two things you need to look at.
First, what’s your reward? If you open up a bank CD, you might be fortunate to get 3%. If you invest it in the stock market, you might get 8% per year (on average) if you held it for 20 years. In other words, you stow away $100 on January 1st, and you might come back and find $108 in your account on December 31st. This is often called the compound annualized growth rate (CAGR)–meaning that if you have $100 one year, earn 8%, you have 108, and then earn 8% on that, and so on.
The second thing to look at is the risk. What can you lose? The simplest answer to this is “the maximum drawdown”. If this sounds complicated, it simply means “the biggest loss”. So, if you had $100 one month, $120 next month, and $90 the month after that, your maximum drawdown (that is, your maximum loss) would be 1 – 90/120 = 25%.
When you put the reward and risk together, you can create a ratio, to see how your rewards and risks line up. This is called a Calmar ratio, and you get it by dividing your CAGR by your maximum drawdown. The Calmar Ratio is a ratio that I interpret as “for every dollar you lose in your investment’s worst performance, how many dollars can you make back in a year?” For my own investments, I prefer this number to be at least 1, and know of a strategy for which that number is above 2 since 2018, or higher than 3 if simulated back to 2008.
Most stocks don’t even have a Calmar ratio of 1, which would mean that, on average, an investment makes more in a year than it can possibly lose. Even Amazon, the company whose stock made Jeff Bezos now the richest man in the world, only has a Calmar Ratio of less than 2/5, with a maximum loss of more than 90% in the dot-com crash. The S&P 500, again, “the stock market”, since 1993, has a Calmar Ratio of around 1/6. That is, the worst losses can take *years* to make back.
A lot of wealth advisers like to say that they recommend a large holding of stocks for young people. In my opinion, whether you’re young or old, losing half of everything hurts, and there are much better ways to make money than to simply buy and hold a collection of stocks.
For those with coding skills, one way to gauge just how good or bad an investment is, is this:
An investment has a history–that is, in January, it made 3%, in February, it lost 2%, in March, it made 5%, and so on. By shuffling that history around, so that say, January loses 2%, February makes 5%, and March makes 3%, you can create an alternate history of the investment. It will start and end in the same place, but the journey will be different. For investments that have existed for a few years, it is possible to create many different histories, and compare the Calmar ratio of the original investment to its shuffled “alternate histories”. Ideally, you want the investment to be ranked among the highest possible ways to have made the money it did.
To put it simply: would you rather fall one inch a thousand times, or fall a thousand inches once? Well, the first one is no different than jumping rope. The second one will kill you.
Here is some code I wrote in R (if you don’t code in R, don’t worry) to see just how the S&P 500 (the stock market) did compared to how it could have done.
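That code is not reproduced here; as a stand-in, a sketch of the idea, shuffling monthly S&P 500 returns and comparing Calmar ratios, could look like this (the data pull, start date and seed are arbitrary choices of mine):

library(quantmod)
library(PerformanceAnalytics)

getSymbols("^GSPC", from = "1993-01-01")           # S&P 500 index, dividends ignored for simplicity
spRets <- monthlyReturn(Ad(GSPC))

calmar <- function(rets) as.numeric(Return.annualized(rets) / maxDrawdown(rets))

set.seed(123)
shuffledCalmars <- replicate(1000, {
  shuffled <- xts(sample(coredata(spRets)), order.by = index(spRets))   # same returns, different order
  calmar(shuffled)
})

realizedCalmar <- calmar(spRets)
mean(shuffledCalmars < realizedCalmar)             # share of alternate histories that did worse
hist(shuffledCalmars); abline(v = realizedCalmar, col = "red")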
This is the resulting plot:
That red line is the actual performance of the S&P 500 compared to what could have been. And of the 1000 different simulations, only 91 did worse than what happened in reality.
This means that the stock market isn’t a particularly good investment, and that you can do much better using tactical asset allocation strategies.
One site I’m affiliated with, is AllocateSmartly. It is a cheap investment subscription service ($30 a month) that compiles a collection of asset allocation strategies that perform better than many wealth advisers. When you combine some of those strategies, the performance is better still. To put it into perspective, one model strategy I’ve come up with has this performance:
In this case, the compound annualized growth rate is nearly double that of the maximum loss. For those interested in something a bit more aggressive, this strategy ensemble uses some fairly conservative strategies in its approach.
In conclusion, when considering how to invest your money, keep in mind both the reward, and the risk. One very simple and important way to understand risk is how much an investment can possibly lose, from its highest, to its lowest value following that peak. When you combine the reward and the risk, you can get a ratio that tells you about how much you can stand to make for every dollar lost in an investment’s worst performance.
Thanks for reading.
NOTE: I am interested in networking opportunities, projects, and full-time positions related to my skill set. If you are looking to collaborate, please contact me on my LinkedIn here.
An Out of Sample Update on DDN’s Volatility Momentum Trading Strategy and Beta Convexity.
The first part of this post is a quick update on Tony Cooper’s (of Double Digit Numerics) volatility ETN momentum strategy from the Volatility Made Simple blog (which stopped updating a year and a half ago). The second part will cover Dr. Jonathan Kinlay’s Beta Convexity concept.
So, now that I have the ability to generate a term structure and constant expiry contracts, I decided to revisit some of the strategies on Volatility Made Simple and see if any of them are any good (long story short: all of the publicly detailed ones aren’t so hot besides mine–they either have a massive drawdown in-sample around the time of the crisis, or a massive drawdown out-of-sample).
Why this strategy? Because it seemed different from most of the usual term structure ratio trades (of which mine is an example), so I thought I’d check out how it did since its first publishing date, and because it’s rather easy to understand.
Here’s the strategy:
Take XIV, VXX, ZIV, VXZ, and SHY (this last one as the “risk free” asset), and at the close, invest in whichever has had the highest 83 day momentum (this was the result of optimization done on volatilityMadeSimple).
Here’s the code to do this in R, using the Quandl EOD database. There are two variants tested–observe the close, buy the close (AKA magical thinking), and observe the close, buy tomorrow’s close.
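As a rough sketch of the logic only (not the original Quandl-based code), assuming prices is an xts of adjusted closes with columns XIV, VXX, ZIV, VXZ and SHY:

library(quantmod)                 # loads xts and TTR
library(PerformanceAnalytics)

rets <- ROC(prices, type = "discrete")                       # daily returns
mom  <- na.omit(ROC(prices, n = 83, type = "discrete"))      # 83-day momentum

# one-hot weights: 1 for the asset with the highest 83-day momentum, 0 elsewhere
best <- xts(t(apply(mom, 1, function(x) as.numeric(x == max(x)))), order.by = index(mom))

# observe the close, buy that same close: the position chosen at t earns the t to t+1 return
alignedMagical  <- lag(best) * rets                          # xts arithmetic aligns on common dates
magicalThinking <- xts(rowSums(alignedMagical), order.by = index(alignedMagical))

# observe the close, buy the next close: one extra day of delay
alignedNext <- lag(best, 2) * rets
nextClose   <- xts(rowSums(alignedNext), order.by = index(alignedNext))

charts.PerformanceSummary(na.omit(merge(magicalThinking, nextClose)))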
Here are the results.
Looks like this strategy didn’t pan out too well. Just a daily reminder that if you’re using fine grid-search to select a particularly good parameter (e.g. n = 83 days? Maybe 4 21-day trading months, but even that would have been n = 84), you’re asking for, in the words of Mr. Tony Cooper, a visit from the grim reaper.
Moving on to another topic: whenever Dr. Jonathan Kinlay posts something that I think I can replicate, I’d be very wise to do so, as he is a very skilled and experienced practitioner (and he also includes me on his blogroll).
A topic that Dr. Kinlay covered is the idea of beta convexity–namely, that an asset’s beta to a benchmark may be different when the benchmark is up as compared to when it’s down. Essentially, it’s the idea that we want to weed out firms that are what I’d deem as “losers in disguise”, i.e. those that act fine when times are good (which is when we really don’t care about diversification, since everything is going up anyway), but do nothing during bad times.
The beta convexity is calculated quite simply: it’s the beta of an asset to a benchmark when the benchmark has a positive return, minus the beta of an asset to a benchmark when the benchmark has a negative return, then squaring the difference. That is, (beta_bench_positive – beta_bench_negative) ^ 2.
Here’s some R code to demonstrate this, using IBM vs. the S&P 500 since 1995.
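A minimal sketch of that calculation follows; using daily returns is my own choice, and the Yahoo pull is only for illustration.

library(quantmod)

getSymbols(c("IBM", "^GSPC"), from = "1995-01-01")
combined <- merge(dailyReturn(Ad(IBM)), dailyReturn(Ad(GSPC)), join = "inner")

betaConvexity <- function(assetBench) {
  asset <- as.numeric(assetBench[, 1])
  bench <- as.numeric(assetBench[, 2])
  up    <- bench > 0
  betaUp   <- coef(lm(asset[up] ~ bench[up]))[2]     # beta on benchmark-up days
  betaDown <- coef(lm(asset[!up] ~ bench[!up]))[2]   # beta on benchmark-down days
  (betaUp - betaDown)^2                              # squared difference = beta convexity
}

betaConvexity(combined)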
Thanks for reading.
NOTE: I am always looking to network, and am currently actively looking for full-time opportunities which may benefit from my skill set. If you have a position which may benefit from my skills, do not hesitate to reach out to me. My LinkedIn profile can be found here.
Testing the Hierarchical Risk Parity algorithm.
This post will be a modified backtest of the Adaptive Asset Allocation backtest from AllocateSmartly, using the Hierarchical Risk Parity algorithm from the last post, because Adam Butler was eager to see my results. On the whole, as Adam Butler told me he had seen, HRP does not generate outperformance when applied to a small, carefully constructed, diversified-by-selection universe of asset classes, as opposed to a universe of hundreds or even several thousand assets, where its theoretically superior properties can make it a superior algorithm.
First off, I would like to thank one Matthew Barry, for helping me modify my HRP algorithm so as to not use the global environment for recursion. You can find his github here.
Here is the modified HRP code.
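Matthew Barry’s exact modification is not reproduced here; as a stand-in, here is a minimal, loop-based sketch of the recursive bisection that avoids both recursion and the global environment, assuming covMat and corMat are square matrices with matching column names.

# clustering order from the correlation matrix (single linkage on the usual HRP distance)
distMat    <- as.dist(sqrt(0.5 * (1 - corMat)))
clustOrder <- hclust(distMat, method = "single")$order

getIVP <- function(covMat) {
  # inverse-variance weights within a cluster
  invDiag <- 1 / diag(as.matrix(covMat))
  invDiag / sum(invDiag)
}

getClusterVar <- function(covMat, cItems) {
  # cluster variance under inverse-variance weights
  covSlice <- covMat[cItems, cItems, drop = FALSE]
  w <- getIVP(covSlice)
  as.numeric(t(w) %*% covSlice %*% w)
}

getRecBipart <- function(covMat, sortIx) {
  # iterative top-down bisection of the sorted assets
  w <- setNames(rep(1, ncol(covMat)), colnames(covMat))
  clusters <- list(sortIx)
  while (length(clusters) > 0) {
    nextClusters <- list()
    for (cl in clusters) {
      if (length(cl) < 2) next
      half <- seq_len(floor(length(cl) / 2))
      c1 <- cl[half]; c2 <- cl[-half]
      alpha <- 1 - getClusterVar(covMat, c1) /
                   (getClusterVar(covMat, c1) + getClusterVar(covMat, c2))
      w[c1] <- w[c1] * alpha          # scale the left cluster
      w[c2] <- w[c2] * (1 - alpha)    # scale the right cluster
      nextClusters <- c(nextClusters, list(c1), list(c2))
    }
    clusters <- nextClusters
  }
  w
}

hrpWeights <- getRecBipart(covMat, colnames(covMat)[clustOrder])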
With covMat and corMat being from the last post. In fact, this function can be further modified by encapsulating the clustering order within the getRecBipart function, but in the interest of keeping the code as similar to Marcos Lopez de Prado’s code as I could, I’ll leave this here.
Anyhow, the backtest will follow. One thing I will mention is that I’m using Quandl’s EOD database, as Yahoo has really screwed up their financial database (i.e. some sector SPDRs have broken data, dividends not adjusted, etc.). While this database is a $50/month subscription, I believe free users can access it up to 150 times in 60 days, so that should be enough to run backtests from this blog, so long as you save your downloaded time series for later use by using write.zoo.
This code needs the tseries library for the portfolio.optim function for the minimum variance portfolio (Dr. Kris Boudt has a course on this at DataCamp), and the other standard packages.
A helper function for this backtest (and really, any other momentum rotation backtest) is the appendMissingAssets function, which simply adds on assets not selected to the final weighting and re-orders the weights by the original ordering.
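A sketch of what such a helper might look like (the signature is my own assumption):

library(xts)

# pad a weight vector with zeros for unselected assets and restore the original column order
appendMissingAssets <- function(wts, allAssetNames, label) {
  missing <- setdiff(allAssetNames, names(wts))
  wts <- c(wts, setNames(rep(0, length(missing)), missing))[allAssetNames]
  xts(t(wts), order.by = as.Date(label))      # one row of weights, stamped with the rebalance date
}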
Next, we make the call to Quandl to get our data.
While Josh Ulrich fixed quantmod to actually get Yahoo data after Yahoo broke the API, the problem is that the Yahoo data is now garbage as well, and I’m not sure how much Josh Ulrich can do about that. I really hope some other provider can step up and provide free, usable EOD data so that I don’t have to worry about readers not being able to replicate the backtest, as my policy for this blog is that readers should be able to replicate the backtests so they don’t just nod and take my word for it. If you are or know of such a provider, please leave a comment so that I can let the blog readers know all about you.
Next, we initialize the settings for the backtest.
While the AAA backtest actually uses a 126 day lookback instead of a 6 month lookback, since it trades at the end of every month a 126 day lookback is effectively a 6 month lookback, give or take a few days out of 126, and the code is less complex this way.
Next, we have our actual backtest.
In a few sentences, this is what happens:
The algorithm takes a subset of the returns (the past six months at every month), and computes absolute momentum. It then ranks the ten absolute momentum calculations, and selects the intersection of the top 5, and those with a return greater than zero (so, a dual momentum calculation).
If no assets qualify, the algorithm invests in nothing. If there’s only one asset that qualifies, the algorithm invests in that one asset. If there are two or more qualifying assets, the algorithm computes a covariance matrix using 20 day volatility multiplied with a 126 day correlation matrix (that is, (sd_20′ %*% sd_20) multiplied element-wise by cor_126). It then computes normalized inverse volatility weights using the volatility from the past 20 days, a minimum variance portfolio with the portfolio.optim function, and lastly, the hierarchical risk parity weights using the HRP code above from Marcos Lopez de Prado’s paper.
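For the covariance construction step, a minimal sketch (returnsSubset being an assumed xts of daily returns for the qualifying assets over the lookback window):

sd20   <- apply(tail(returnsSubset, 20), 2, sd)     # 20-day volatilities
cor126 <- cor(tail(returnsSubset, 126))             # 126-day correlation matrix
covMat <- (sd20 %o% sd20) * cor126                  # outer product of vols, scaled element-wise by correlations
corMat <- cor126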
Lastly, the program puts together all of the weights, and adds a cash investment for any period without any investments.
Here are the results:
In short, in the context of a small, carefully-selected and allegedly diversified (I’ll let Adam Butler speak for that one) universe dominated by the process of which assets to invest in as opposed to how much, the theoretical upsides of an algorithm which simultaneously exploits a covariance structure without needing to invert a covariance matrix can be lost.
However, this test (albeit from 2007 onwards, thanks to ETF inception dates combined with lookback burn-in) confirms what Adam Butler himself told me, which is that HRP hasn’t impressed him, and from this backtest, I can see why. However, in the context of dual momentum rank selection, I’m not convinced that any weighting scheme will realize much better performance than any other.
Thanks for reading.
NOTE: I am always interested in networking and hearing about full-time opportunities related to my skill set. My linkedIn profile can be found here.

The R Trader.
Using R and related tools in Quantitative Finance.
Visualizing Time Series Data in R.
I’m very pleased to announce my DataCamp course on Visualizing Time Series Data in R. This course is also part of the Time Series with R skills track. Feel free to have a look, the first chapter is free!
Course Description.
As the saying goes, “A chart is worth a thousand words”. This is why visualization is the most used and powerful way to get a better understanding of your data. After this course you will have a very good overview of R time series visualization capabilities and you will be able to better decide which model to choose for subsequent analysis. You will also be able to convey the message you want to deliver in an efficient and beautiful way.
Course Outline.
Chapter 1: R Time Series Visualization Tools.
This chapter will introduce you to basic R time series visualization tools.
Chapter 2: Univariate Time Series.
Univariate plots are designed to learn as much as possible about the distribution, central tendency and spread of the data at hand. In this chapter you will be presented with some visual tools used to diagnose univariate time series.
Chapter 3: Multivariate Time Series.
What to do if you have to deal with multivariate time series? In this chapter, you will learn how to identify patterns in the distribution, central tendency and spread over pairs or groups of data.
Chapter 4: Case study: Visually selecting a stock that improves your existing portfolio.
Let’s put everything you learned so far in practice! Imagine you already own a portfolio of stocks and you have some spare cash to invest, how can you wisely select a new stock to invest your additional cash? Analyzing the statistical properties of individual stocks vs. an existing portfolio is a good way of approaching the problem.
Linking R to IQFeed with the QuantTools package.
IQFeed provides streaming data services and trading solutions that cover the Agricultural, Energy and Financial marketplace. It is a well known and recognized data feed provider geared toward retail users and small institutions. The subscription price starts at around $80/month.
Stanislav Kovalevsky has developed a package called QuantTools. It is an all in one package designed to enhance quantitative trading modelling. It allows you to download and organize historical market data from multiple sources like Yahoo, Google, Finam, MOEX and IQFeed. The feature that interests me the most is the ability to link IQFeed to R. I’ve been using IQFeed for a few years and I’m happy with it (I’m not affiliated with the company in any way). More information can be found here. I’ve been looking for an integration within R for a while and here it is. As a result, after I ran a few tests, I moved my code that was still in Python into R. Just for completeness, here’s a link that explains how to download historical data from IQFeed using Python.
QuantTools offers four main functionalities: Get market data, Store/Retrieve market data, Plot time series data and Back testing.
First make sure that IQFeed is open. You can either download daily or intraday data. The below code downloads daily prices (Open, High, Low, Close) for SPY from 1st Jan 2017 to 1st June 2017.
The below code downloads intraday data from 1st May 2017 to 3rd May 2017.
Note the period parameter. It can take any of the following values: tick, 1min, 5min, 10min, 15min, 30min, hour, day, week, month, depending on the frequency you need.
QuantTools makes the process of managing and storing tick market data easy. You just set up storage parameters and you are ready to go. The parameters are where, since what date and which symbols you would like to be stored. At any time you can add more symbols, and if they are not present in storage, QuantTools tries to get the data from the specified start date. The code below will save the data in the following directory: “C:/Users/Arnaud/Documents/Market Data/iqfeed”. There is one sub-folder per instrument and the data is saved in .rds files.
You can also store data between specific dates. Replace the last line of code above with one of the below.
Now should you want to get back some of the data you stored, just run something like:
Note that only ticks are supported in local storage, so period must be ‘tick’.
QuantTools provides plot_ts function to plot time series data without weekend, holidays and overnight gaps. In the example below, I first retrieve the data stored above, then select the first 100 price observations and finally draw the chart.
Two things to notice: first, spy is a data.table object, hence the syntax above. To get a quick overview of data.table capabilities have a look at this excellent cheat sheet from DataCamp. Second, the local parameter is TRUE as the data is retrieved from internal storage.
QuantTools allows you to write your own trading strategy using its C++ API. I’m not going to elaborate on this as it is basically C++ code. You can refer to the Examples section on the QuantTools website.
Overall I find the package extremely useful and well documented. The only missing bit is the live feed between R and IQFeed which will make the package a real end to end solution.
As usual any comments welcome.
BERT: a newcomer in the R Excel connection.
A few months ago a reader pointed me to this new way of connecting R and Excel. I don’t know how long this has been around, but I had never come across it and I’ve never seen any blog post or article about it. So I decided to write a post, as the tool is really worth it, and before anyone asks, I’m not related to the company in any way.
BERT stands for Basic Excel R Toolkit. It’s free (licensed under the GPL v2) and it has been developed by Structured Data LLC. At the time of writing the current version of BERT is 1.07. More information can be found here. From a more technical perspective, BERT is designed to support running R functions from Excel spreadsheet cells. In Excel terms, it’s for writing User-Defined Functions (UDFs) in R.
In this post I’m not going to show you how R and Excel interact via BERT. There are very good tutorials here, here and here. Instead I want to show you how I used BERT to build a “control tower” for my trading.
My trading signals are generated using a long list of R files but I need the flexibility of Excel to display results quickly and efficiently. As shown above BERT can do this for me but I also want to tailor the application to my needs. By combining the power of XML, VBA, R and BERT I can create a good looking yet powerful application in the form of an Excel file with minimum VBA code. Ultimately I have a single Excel file gathering all the necessary tasks to manage my portfolio: database update, signal generation, order submission etc… My approach could be broken down into the 3 steps below:
Use XML to build user defined menus and buttons in an Excel file. The above menus and buttons are essentially calls to VBA functions. Those VBA functions are wrappers around R functions defined using BERT.
With this approach I can keep a clear distinction between the core of my code kept in R, SQL and Python and everything used to display and format results kept in Excel, VBA & XML. In the next sections I present the prerequisites to develop such an approach and a step by step guide that explains how BERT could be used for simply passing data from R to Excel with minimal VBA code.
1 – Download and install BERT from this link. Once the installation has completed you should have a new Add-Ins menu in Excel with the buttons as shown below. This is how BERT materializes in Excel.
2 – Download and install the Custom UI Editor: The Custom UI Editor allows you to create user defined menus and buttons in the Excel ribbon. A step by step procedure is available here.
1 – R Code: The below R function is a very simple piece of code for illustration purposes only. It calculates and returns the residuals from a linear regression. This is what we want to retrieve in Excel. Save this in a file called myRCode.R (any other name is fine) in a directory of your choice.
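A sketch of such a function (the name is my own; y and x arrive from Excel as ranges):

# save as myRCode.R -- returns the residuals of a simple linear regression
myLinearResiduals <- function(y, x) {
  fit <- lm(as.numeric(y) ~ as.numeric(x))
  residuals(fit)
}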
2 – functions.R in BERT: From Excel select Add-Ins -> Home Directory and open the file called functions.R. In this file paste the following code. Make sure you insert the correct path.
This is just sourcing into BERT the R file you created above. Then save and close the file functions.R. Should you want to make any change to the R file created in step 1 you will have to reload it using the BERT button “Reload Startup File” from the Add-Ins menu in Excel.
3 – In Excel: Create and save a file called myFile.xlsm (any other name is fine). This is a macro-enabled file that you save in the directory of your choice. Once the file is saved, close it.
4 – Open the file created above in the Custom UI Editor: Once the file is open, paste the below code.
You should have something like this in the XML editor:
Essentially this piece of XML code creates an additional menu (RTrader), a new group (My Group) and a user defined button (New Button) in the Excel ribbon. Once you’re done, open myFile.xlsm in Excel and close the Custom UI Editor. You should see something like this.
5 – Open the VBA editor: In myFile.xlsm insert a new module. Paste the code below in the newly created module.
This erases previous results in the worksheet prior to copying new ones.
6 – Click New Button: Now go back to the spreadsheet and in the RTrader menu click the “New Button” button. You should see something like the below appear.
The guide above is a very basic version of what can be achieved using BERT but it shows you how to combine the power of several specific tools to build your own custom application. From my perspective the interest of such an approach is the ability to glue together R and Excel obviously but also to include via XML (and batch) pieces of code from Python, SQL and more. This is exactly what I needed. Finally I would be curious to know if anyone has any experience with BERT?
Trading strategy: Making the most of the out of sample data.
When testing trading strategies a common approach is to divide the initial data set into in sample data (the part of the data designed to calibrate the model) and out of sample data (the part of the data used to validate the calibration and ensure that the performance created in sample will be reflected in the real world). As a rule of thumb, around 70% of the initial data can be used for calibration (i.e. in sample) and 30% for validation (i.e. out of sample). Then a comparison of the in and out of sample data helps to decide whether the model is robust enough. This post aims at going a step further and provides a statistical method to decide whether the out of sample data is in line with what was created in sample.
In the chart below the blue area represents the out of sample performance for one of my strategies.
A simple visual inspection reveals a good fit between the in and out of sample performance, but what degree of confidence do I have in this? At this stage not much, and this is the issue. What is truly needed is a measure of similarity between the in and out of sample data sets. In statistical terms this could be translated as the likelihood that the in and out of sample performance figures come from the same distribution. There is a non-parametric statistical test that does exactly this: the Kruskal-Wallis test. A good definition of this test can be found on R-Tutor: “A collection of data samples are independent if they come from unrelated populations and the samples do not affect each other. Using the Kruskal-Wallis Test, we can decide whether the population distributions are identical without assuming them to follow the normal distribution.” An added benefit of this test is that it does not assume a normal distribution.
Other tests of the same nature could fit into this framework. The Mann-Whitney-Wilcoxon test or the Kolmogorov-Smirnov test would perfectly suit the framework described here; however, it is beyond the scope of this article to discuss the pros and cons of each of these tests. A good description along with R examples can be found here.
Here’s the code used to generate the chart above and the analysis:
In the example above the in sample period is longer than the out of sample period, therefore I randomly created 1000 subsets of the in sample data, each of them having the same length as the out of sample data. Then I tested each in sample subset against the out of sample data and recorded the p-values. This process creates not a single p-value for the Kruskal-Wallis test but a distribution, making the analysis more robust. In this example the mean of the p-values is well above zero (0.478), indicating that the null hypothesis cannot be rejected: there is strong evidence that the in and out of sample data come from the same distribution.
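A minimal sketch of that resampling (inSample and outSample being assumed numeric vectors of daily returns):

set.seed(1)
pValues <- replicate(1000, {
  idx <- sample(length(inSample), length(outSample))      # random in sample subset, same length as out of sample
  kruskal.test(list(inSample[idx], outSample))$p.value
})
mean(pValues)                                             # a high average p-value: no evidence the two samples differ
hist(pValues)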
As usual what is presented in this post is a toy example that only scratches the surface of the problem and should be tailored to individual needs. However I think it proposes an interesting and rational statistical framework to evaluate out of sample results.
This post is inspired by the following two papers:
Vigier Alexandre, Chmil Swann (2007), “Effects of Various Optimization Functions on the Out of Sample Performance of Genetically Evolved Trading Strategies”, Forecasting Financial Markets Conference.
Vigier Alexandre, Chmil Swann (2018), « An optimization process to improve in/out of sample consistency, a Stock Market case», JP Morgan Cazenove Equity Quantitative Conference, London October 2018.
Introducing fidlr: FInancial Data LoadeR.
fidlr is an RStudio addin designed to simplify the financial data downloading process from various providers. This initial version is a wrapper around the getSymbols function in the quantmod package and only Yahoo, Google, FRED and Oanda are supported. I will probably add functionalities over time. As usual with those things just a kind reminder: “THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND…”
How to install and use fidlr?
You can get the addin/package from its Github repository here (I will register it on CRAN later on) and install the addin. There is an excellent tutorial on installing RStudio Addins here. Once the addin is installed it should appear in the Addins menu. Just choose fidlr in the menu and a window as pictured below should appear. Choose a data provider from the Source dropdown menu. Select a date range from the Date menu. Enter the symbols you wish to download in the Instrument text box. To download several symbols just enter the symbols separated by commas. Use the radio buttons to choose whether you want to download the instruments into csv files or into the global environment. The csv files will be saved in the working directory and there will be one csv file per instrument. Press Run to get the data or Close to close down the addin.
Error messages and warnings are handled by the underlying packages (quantmod and Shiny) and can be read from the console.
This is a very first version of the project so do not expect perfection but hopefully it will get better over time. Please report any comment, suggestion, bug etc… to: thertrader@gmail.
Maintaining a database of price files in R.
Doing quantitative research implies a lot of data crunching and one needs clean and reliable data to achieve this. What is really needed is clean data that is easily accessible (even without an internet connection). The most efficient way to do this for me has been to maintain a set of csv files. Obviously this process can be handled in many ways but I have found it very efficient and simple over time to maintain a directory where I store and update csv files. I have one csv file per instrument and each file is named after the instrument it contains. The reason I do so is twofold: first, I don’t want to download (price) data from Yahoo, Google etc… every time I want to test a new idea, but more importantly, once I’ve identified and fixed a problem, I don’t want to have to do it again the next time I need the same instrument. Simple yet very efficient so far. The process is summarized in the chart below.
In everything that follows, I assume that data is coming from Yahoo. The code will have to be amended for data from Google, Quandl etc… In addition I present the process of updating daily price data. The setup will be different for higher frequency data and other types of datasets (i.e. different from prices).
1 – Initial data downloading (listOfInstruments.R & historicalData.R)
The file listOfInstruments.R contains only the list of all instruments.
If an instrument isn’t part of my list (i.e. there is no csv file in my data folder), or if you are doing this for the very first time, you have to download the initial historical data set. The example below downloads a set of ETFs’ daily prices from Yahoo Finance back to January 2000 and stores the data in csv files.
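A minimal sketch of such a download loop (the instrument list and folder are placeholders):

library(quantmod)

instruments <- c("SPY", "EFA", "EEM", "TLT", "GLD")
dataFolder  <- "D:/MarketData/"

for (symbol in instruments) {
  prices <- getSymbols(symbol, from = "2000-01-01", src = "yahoo", auto.assign = FALSE)
  write.zoo(prices, file = paste0(dataFolder, symbol, ".csv"), sep = ",")   # one csv per instrument
}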
2 – Update existing data (updateData.R)
The below code starts from existing files in the dedicated folder and updates all of them one after the other. I usually run this process every day except when I’m on holiday. To add a new instrument, simply run step 1 above for this instrument alone.
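A sketch of that update step (same placeholder folder as above; error handling is omitted):

library(quantmod)

dataFolder <- "D:/MarketData/"
files <- list.files(dataFolder, pattern = "\\.csv$")

for (f in files) {
  symbol   <- sub("\\.csv$", "", f)
  existing <- read.zoo(paste0(dataFolder, f), header = TRUE, sep = ",", FUN = as.Date)
  fresh    <- getSymbols(symbol, from = end(existing) + 1, src = "yahoo", auto.assign = FALSE)
  if (NROW(fresh) > 0) {
    write.zoo(rbind(as.xts(existing), as.xts(fresh)),     # append the new rows and overwrite the file
              file = paste0(dataFolder, f), sep = ",")
  }
}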
3 – Create a batch file (updateDailyPrices.bat)
Another important part of the job is creating a batch file that automates the updating process above (I’m a Windows user). This avoids opening R/RStudio and running the code from there. The code below is placed in a .bat file (the path has to be amended for the reader’s setup). Note that I added an output file (updateLog.txt) to track the execution.
The process above is extremely simple because it only describes how to update daily price data. I’ve been using this for a while and it has been working very smoothly for me so far. For more advanced data and/or higher frequencies, things can get much trickier.
As usual any comments welcome.
The Rise of the Robots (Advisors…)
The Asset Management industry is on the verge of a major change. Over the last couple of years Robo Advisors (RA) have emerged as new players. The term itself is hard to define as it encompasses a large variety of services. Some are designed to help traditional advisers better allocate their clients’ money and some are real “black boxes”. The user enters a few criteria (age, income, children, etc.) and the robot proposes a tailor-made allocation. Between those two extremes a full range of offers is available. I found the Wikipedia definition pretty good: “They are a class of financial adviser that provides portfolio management online with minimal human intervention”. More precisely they use algorithm-based portfolio management to offer the full spectrum of services a traditional adviser would offer: dividend reinvesting, compliance reports, portfolio rebalancing, tax loss harvesting etc… (well, this is what the quantitative investment community has been doing for decades!). The industry is still in its infancy with most players still managing a small amount of money, but I only realised how profound the change was when I was in NYC a few days ago. When RAs get their names on TV ads or on the roofs of NYC cabs you know something big is happening…
It is getting more and more attention from the media and, above all, it makes a lot of sense from an investor’s perspective. There are actually two main advantages in using RAs:
Significantly lower fees than traditional advisers. Investment is made more transparent and simpler, which is more appealing to people with limited financial knowledge.
In this post R is just an excuse to present nicely what is a major trend in the asset management industry. The chart below shows the market shares of the most popular RAs as of the end of 2018. The code used to generate the chart below can be found at the end of this post and the data is here.
Those figures are a bit dated given how fast this industry evolves but are still very informative. Not surprisingly the market is dominated by US providers like Wealthfront and Betterment, but RAs are emerging all over the world: Asia (8Now!), Switzerland (InvestGlass), France (Marie Quantier)… It is starting to significantly affect the way traditional asset managers do business. A prominent example is the partnership between Fidelity and Betterment. Since December 2018 Betterment has passed the $2 billion AUM mark.
Despite all the above, I think the real change is ahead of us. Because they use fewer intermediaries and low commission products (like ETFs), they charge much lower fees than traditional advisers. RAs will certainly gain significant market share but they will also lower the fees charged by the industry as a whole. Ultimately it will affect the way traditional investment firms do business. Active portfolio management, which has been having a tough time for some years now, will suffer even more. The high fees it charges will be even harder to justify unless it reinvents itself. Another potential impact is the rise of ETFs and low commission financial products in general. Obviously this started a while ago but I do think the effect will be even more pronounced in the coming years. New generations of ETFs track more complex indices and custom made strategies. This trend will inevitably get stronger.
As usual any comments welcome.
R financial time series tips everyone should know about.
There are many R time series tutorials floating around on the web; this post is not designed to be one of them. Instead I want to introduce a list of the most useful tricks I came across when dealing with financial time series in R. Some of the functions presented here are incredibly powerful but unfortunately buried in the documentation, hence my desire to create a dedicated post. I only address daily or lower frequency time series. Dealing with higher frequency data requires specific tools: the data.table or highfrequency packages are some of them.
xts: The xts package is the must-have when it comes to time series in R. The example below loads the package and creates a daily time series of 400 days of normally distributed returns.
merge.xts (package xts): This is incredibly powerful when it comes to binding two or more time series together, whether they have the same length or not. The join argument does the magic! It determines how the binding is done.
apply.yearly/apply.monthly (package xts): Apply a specified function to each distinct period in a given time series object. The example below calculates monthly and yearly returns of the second series in the tsInter object. Note that I use the sum of returns (no compounding).
endpoints (package xts): Extract index values of a given xts object corresponding to the last observations given a period specified by on. The example gives the last-day-of-month returns for each series in the tsInter object, using endpoints to select the dates.
na.locf (package zoo): Generic function for replacing each NA with the most recent non-NA prior to it. Extremely useful when dealing with a time series that has a few “holes” and when this time series is subsequently used as input for an R function that does not accept arguments with NAs. In the example I create a time series of random prices, then artificially include a few NAs in it and replace them with the most recent value.
charts.PerformanceSummary (package PerformanceAnalytics): For a set of returns, create a wealth index chart, bars for per-period performance, and underwater chart for drawdown. This is incredibly useful as it displays on a single window all the relevant information for a quick visual inspection of a trading strategy. The example below turns the price series into an xts object, then displays a window with the 3 charts described above.
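A compact sketch tying these functions together (all series here are simulated):

library(xts)
library(PerformanceAnalytics)

# two overlapping daily series of normally distributed returns
dates <- Sys.Date() - 399:0
ts1 <- xts(rnorm(400, 0, 0.01), order.by = dates)
ts2 <- xts(rnorm(300, 0, 0.01), order.by = tail(dates, 300))

tsInter <- merge(ts1, ts2, join = "inner")       # merge.xts: keep only dates common to both series
tsOuter <- merge(ts1, ts2, join = "outer")       # keep all dates, NAs where a series is missing

apply.monthly(tsInter[, 2], sum)                 # monthly sum of returns (no compounding)
apply.yearly(tsInter[, 2], sum)                  # yearly sum of returns
tsInter[endpoints(tsInter, on = "months"), ]     # last observation of each month

tsFilled <- na.locf(tsOuter)                     # carry the last non-NA observation forward

charts.PerformanceSummary(ts1)                   # wealth index, per-period bars, drawdown chart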
The list above is by no means exhaustive, but once you master the functions described in this post, manipulating financial time series becomes a lot easier, the code gets shorter and its readability improves.
As usual any comments welcome.
Factor Evaluation in Quantitative Portfolio Management.
When it comes to managing a portfolio of stocks versus a benchmark the problem is very different from defining an absolute return strategy. In the former one has to hold more stocks than in the latter, where no stocks at all can be held if there is no good enough opportunity. The reason for that is the tracking error. This is defined as the standard deviation of the portfolio return minus the benchmark return. The fewer stocks held vs. a benchmark, the higher the tracking error (i.e. the higher the risk).
The analysis that follows is largely inspired by the book “Active Portfolio Management” by Grinold & Kahn. This is the bible for anyone interested in running a portfolio against a benchmark. I strongly encourage anyone with an interest in the topic to read the book from the beginning to the end. It’s very well written and lays the foundations of systematic active portfolio management (I have no affiliation to the editor or the authors).
Here we’re trying to rank as accurately as possible the stocks in the investment universe on a forward return basis. Many people came up with many tools, and countless variants of those tools have been developed to achieve this. In this post I focus on two simple and widely used metrics: Information Coefficient (IC) and Quantiles Return (QR).
The IC gives an overview of the factor’s forecasting ability. More precisely, this is a measure of how well the factor ranks the stocks on a forward return basis. The IC is defined as the rank correlation (ρ) between the metric (e.g. the factor) and the forward return. In statistical terms the rank correlation is a nonparametric measure of dependence between two variables. For a sample of size n, the n raw scores are converted to ranks, and ρ is computed as ρ = 1 - 6 * Σ d_i^2 / (n * (n^2 - 1)), where d_i is the difference between the ranks of the two variables for observation i.
The horizon for the forward return has to be defined by the analyst and it’s a function of the strategy’s turnover and the alpha decay (this has been the subject of extensive research). Obviously ICs must be as high as possible in absolute terms.
For the keen reader, in the book by Grinold & Kahn a formula linking the Information Ratio (IR) and the IC is given: IR = IC * sqrt(breadth), with breadth being the number of independent bets (trades). This formula is known as the fundamental law of active management. The problem is that, often, defining breadth accurately is not as easy as it sounds.
In order to have a more accurate estimate of the factor’s predictive power it’s necessary to go a step further and group stocks by quantile of factor values, then analyse the average forward return (or any other central tendency metric) of each of those quantiles. The usefulness of this tool is straightforward. A factor can have a good IC but its predictive power might be limited to a small number of stocks. This is not good, as a portfolio manager will have to pick stocks within the entire universe in order to meet the tracking error constraint. Good quantile returns are characterised by a monotonic relationship between the individual quantiles and forward returns.
All the stocks in the S&P500 index (at the time of writing). Obviously there is a survivorship bias: the list of stocks in the index has changed significantly between the start and the end of the sample period; however, it’s good enough for illustration purposes.
The code below downloads individual stock prices in the S&P500 between Jan 2005 and today (it takes a while) and turns the raw prices into return over the last 12 months and the last month. The former is our factor, the latter will be used as the forward return measure.
Below is the code to compute the Information Coefficient and Quantiles Return. Note that I used quintiles in this example but any other grouping method (terciles, deciles etc…) can be used. It really depends on the sample size, what you want to capture and whether you want to have a broad overview or focus on distribution tails. For estimating returns within each quintile, the median has been used as the central tendency estimator. This measure is much less sensitive to outliers than the arithmetic mean.
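A rough sketch of both computations (not the original code), assuming factorData holds the trailing 12 month returns and fwdRet the next month returns, with months in rows and stocks in columns:

# Information Coefficient: cross-sectional rank correlation, month by month
ic <- sapply(1:nrow(factorData), function(i) {
  cor(as.numeric(factorData[i, ]), as.numeric(fwdRet[i, ]),
      method = "spearman", use = "pairwise.complete.obs")
})
mean(ic, na.rm = TRUE)

# Quantiles Return: median forward return by factor quintile, month by month
quintileRet <- t(sapply(1:nrow(factorData), function(i) {
  f <- as.numeric(factorData[i, ])
  r <- as.numeric(fwdRet[i, ])
  q <- cut(f, breaks = quantile(f, probs = seq(0, 1, 0.2), na.rm = TRUE),
           labels = paste0("Q", 1:5), include.lowest = TRUE)
  tapply(r, q, median, na.rm = TRUE)
}))
colMeans(quintileRet, na.rm = TRUE)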
And finally, the code to produce the Quantile Returns chart.
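A minimal sketch of the plotting step, assuming the results matrix built in the previous snippet:

```r
# Sketch of the chart: bar chart of the average median return per quintile.
avgStats <- colMeans(results, na.rm = TRUE)
cat("Average IC:", round(avgStats["IC"], 4), "\n")
barplot(avgStats[paste0("Q", 1:5)] * 100,
        col  = "steelblue",
        ylab = "Median forward 1-month return (%)",
        main = "Quintile returns - past 12-month return factor")
```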
3 – How to exploit the information above?
In the chart above, Q1 groups the stocks with the lowest past 12-month return and Q5 the highest. There is an almost monotonic increase in the quantile returns between Q1 and Q5, which clearly indicates that stocks falling into Q5 outperform those falling into Q1 by about 1% per month. This is very significant and powerful for such a simple factor (not really a surprise though…). Therefore there are greater chances of beating the index by overweighting the stocks falling into Q5 and underweighting those falling into Q1 relative to the benchmark.
An IC of 0.0206 might not mean a great deal in itself, but it’s significantly different from 0 and indicates good overall predictive power for the past 12-month return. Formal significance tests can be run but this is beyond the scope of this article.
The above framework is excellent for evaluating an investment factor’s quality; however, there are a number of practical limitations that have to be addressed for real-life implementation:
Rebalancing: In the description above, it’s assumed that at the end of each month the portfolio is fully rebalanced: all stocks falling into Q1 are underweight and all stocks falling into Q5 are overweight relative to the benchmark. This is not always possible for practical reasons: some stocks might be excluded from the investment universe, there are constraints on industry or sector weights, there are constraints on turnover etc…
Transaction costs: These have not been taken into account in the analysis above, and they are a serious brake on real-life implementation. Turnover considerations are usually implemented in real life in the form of a penalty on factor quality.
Transfer coefficient: This is an extension of the fundamental law of active management. It relaxes the assumption of Grinold’s model that managers face no constraints which preclude them from translating their investment insights directly into portfolio bets (a one-line illustration follows below).
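The extended form of the fundamental law scales the expected information ratio by the transfer coefficient, IR = TC × IC × √breadth. A purely illustrative sketch with made-up numbers:

```r
# Illustrative only: the extended fundamental law, IR = TC * IC * sqrt(breadth),
# where TC (the transfer coefficient) measures how much of the raw signal
# survives the portfolio constraints. All numbers below are made up.
tc      <- 0.6                  # heavily constrained portfolio
ic      <- 0.05                 # raw skill
breadth <- 120                  # independent bets
tc * ic * sqrt(breadth)         # constrained IR ~0.33 vs. ~0.55 unconstrained
```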
And finally, I’m amazed by what can be achieved in less than 80 lines of code with R…
As usual any comments welcome.
Risk as a “Survival Variable”
I come across a lot of strategies on the blogosphere: some are interesting, some are a complete waste of time, but most share a common feature: people developing those strategies do their homework in terms of analysing the return, but much less attention is paid to the risk side and its random nature. I’ve seen comments like “a 25% drawdown in 2018 but excellent return overall”. Well, my bet is that no one on earth will let you experience a 25% loss with their money (unless special agreements are in place). In the hedge fund world people have very low tolerance for drawdowns. Generally, as a new trader in a hedge fund, assuming that you come with no reputation, you have very little time to prove yourself. You should make money from day 1 and keep on doing so for a few months before you gain a bit of credibility.
First let’s say you have a bad start and you lose money at the beginning. With a 10% drawdown you’re most certainly out, but even with a 5% drawdown the chances of seeing your allocation reduced are very high. This has significant implications for your strategies. Let’s assume that if you lose 5% your allocation is divided by 2, and you come back to your initial allocation only when you pass the high-water mark again (i.e. the drawdown comes back to 0). In the chart below I simulated the experiment with one of my strategies.
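A minimal sketch of that allocation rule on simulated daily returns (the actual strategy returns are not reproduced here): the allocation is halved once the account drawdown reaches -5% and restored only at a new high-water mark.

```r
# Sketch of the allocation rule: halve at a -5% drawdown, restore at a new
# high-water mark. Daily returns are simulated, not the author's strategy.
set.seed(42)
stratRet <- rnorm(500, mean = 0.0005, sd = 0.01)  # hypothetical daily returns

managed <- numeric(length(stratRet))  # returns experienced under the rule
scale   <- 1                          # current allocation (1 = full, 0.5 = halved)
acct    <- 1                          # account equity
hwm     <- 1                          # high-water mark

for (i in seq_along(stratRet)) {
  managed[i] <- scale * stratRet[i]
  acct       <- acct * (1 + managed[i])
  if (acct >= hwm) {                           # new high-water mark reached
    hwm   <- acct
    scale <- 1                                 # full allocation restored
  } else if (acct / hwm - 1 <= -0.05) {        # drawdown beyond -5%
    scale <- 0.5                               # allocation cut in half
  }
}

plot(cumprod(1 + managed), type = "l",
     xlab = "Day", ylab = "Equity with the allocation rule")
lines(cumprod(1 + stratRet), lty = 2)          # unscaled strategy for comparison
```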
You start trading on 1st June 2003 and all goes well until 23rd Jul. 2003, when your drawdown curve hits the -5% threshold (1). Your allocation is cut by 50% and you don’t cross back above the high-water mark until 5th Dec. 2003 (3). If you had kept the allocation unchanged, the high-water mark would have been crossed on 28th Oct. 2003 (2) and by the end of the year you would have made more money.
But let’s push the reasoning a bit further. Still on the chart above, assume you get really unlucky and start trading toward mid-June 2003. You hit the 10% drawdown limit by the beginning of August and you’re most likely out of the game. Had you started in early August instead, your allocation would not have been cut at all and you would have ended up with a good year in only 4 full months of trading. In those two examples nothing has changed but your starting date…
The trading success of any individual has some form of path dependency and there is not much you can do about it. However you can control the size of a strategy’s drawdown and this should be addressed with great care. A portfolio should be diversified in every possible dimension: asset classes, investment strategies, trading frequencies etc…. From that perspective risk is your “survival variable”. If managed properly you have a chance to stay in the game long enough to realise the potential of your strategy. Otherwise you won’t be there next month to see what happens.
