Public Articles
sfw Project Report
In February and March 2016, CorrelAid and streetfootballworld (sfw) co-organized a workshop series on using data to unleash the power of football for social good. Across three workshops, nine experts from CorrelAid's network provided pro bono consultancy to streetfootballworld on its network member communications, campaigns, and performance metrics. The workshops addressed the following themes:
Challenges for the CIO
| SWOT | Helpful | Harmful |
| --- | --- | --- |
| Internal | Strengths · cost reduction · scalable IT resources · technology is up to date · no investment risk · server management is outsourced | Weaknesses · dependent on the provider · risk of giving up in-house IT competence · working internet connection needed · reliability of the cloud |
| External | Opportunities · green IT · drives innovation | Threats · safety and data protection · data protection regulations |
An incompressible inhomogeneous fluid under conditions of heat exchange
Which fluids are considered incompressible? This is often taken to mean a fluid whose velocity divergence vanishes everywhere, $\nabla \cdot \vec u \equiv 0$. In fact, this holds only in two cases: (1) thermodynamic equilibrium is reached faster than mechanical equilibrium, so the temperature distribution is quasi-homogeneous; or (2) fluid particles do not exchange heat with each other at all. In other words, in the first case heat exchange is very intense, and in the second it is absent altogether. For fluids with an inhomogeneous temperature distribution and with internal heat-exchange mechanisms present, the incompressibility condition takes a different form, i.e. for them $\nabla \cdot \vec u \ne 0$. In this work the incompressibility condition is established on the basis of the more fundamental continuity condition and the heat transport equation.
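One common route to such a condition (a sketch only, under the simplifying assumption that density depends on temperature alone with a constant thermal expansion coefficient $\beta$) combines the continuity equation with a linearized equation of state:
\[
\frac{d\rho}{dt} + \rho\,\nabla\cdot\vec u = 0,
\qquad
\rho \approx \rho_0\left[1 - \beta\,(T - T_0)\right]
\quad\Longrightarrow\quad
\nabla\cdot\vec u = -\frac{1}{\rho}\frac{d\rho}{dt} \approx \beta\,\frac{dT}{dt},
\]
where $dT/dt$ is governed by the heat transport equation.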
Finally, the author considers two applications of the incompressibility condition obtained here to fluid kinematics and hydrostatics: computing the vertical velocity of a free surface and the pressure at the bottom of the fluid. The result is that the vertical velocity of the free surface is determined not only by the velocity divergence of the fluid column beneath it, but also by the combined effects of thermal expansion and the intensity of heat exchange. The pressure, by contrast, is unaffected by thermal expansion and heat exchange.
ARTIFICIAL NEURAL NETWORKS FOR MODELING THE PREDICTION OF THE DEFORMATION OF A MATERIAL EXPOSED TO SOLAR RADIATION
An Artificial Neural Network (ANN) is a mathematical model that tries to emulate biological neural systems in their processing of information \cite{alejo2010analisis}.
ANNs are based on a structure of neurons joined by links that transmit information to other neurons, which deliver a result through mathematical functions. ANNs learn from historical information through training, the process by which the network parameters are adjusted in order to deliver the desired response, thereby acquiring the ability to predict responses of the same phenomenon. The behaviour of a network therefore depends on the weights of its links, on the activation functions specified for its neurons, which can be of three categories (linear, threshold or step, and sigmoid), and on the way the error is propagated \cite{freeman1991algorithms}.
There are several algorithms for progressively correcting the prediction error; one of the most widely used is backpropagation, which basically consists of propagating the error backwards, from the output layer to the input layer, thus allowing the weights to be adapted in order to reduce that error \cite{hilera2000redes}.
In simplified form, a backpropagation network learns a predefined set of input-output pairs given as examples, using a two-phase propagation-adaptation cycle: first, a pattern is applied as a stimulus to the input layer of the network and is propagated through the following layers to generate the output, which yields the error value when compared with the desired output. These errors are then transmitted backwards, starting from the output layer, towards all the neurons of the intermediate hidden layer that contribute directly to the output, each receiving a share of the error approximately proportional to its contribution to the original output \cite{ovando2005redes}.
This process is repeated backwards, layer by layer, until every neuron in the network has received an error describing its relative contribution to the total error. Based on this information, all connection weights are readjusted so that the next time the same pattern is presented, the difference between the computed output and the desired one is smaller \cite{ovando2005redes}.
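As an illustration only (not the network used in this article), a minimal NumPy sketch of the propagation-adaptation cycle for a single-hidden-layer network with sigmoid activations; the layer sizes, learning rate and toy data are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Made-up toy data: 8 examples, 3 input features, 1 output target.
X = rng.random((8, 3))
y = rng.random((8, 1))

# Random initial weights for a 3-4-1 network.
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 1))
lr = 0.5

for epoch in range(1000):
    # Forward pass: propagate the input through the hidden and output layers.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)

    # Output error and its backward propagation through the sigmoid derivatives.
    err_out = (out - y) * out * (1 - out)
    err_hid = (err_out @ W2.T) * h * (1 - h)

    # Weight adaptation proportional to each neuron's contribution to the error.
    W2 -= lr * h.T @ err_out
    W1 -= lr * X.T @ err_hid
```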
The importance of the backpropagation network lies in its capacity to self-adapt the weights of the neurons in its intermediate layers in order to learn the relationship between a set of patterns given as examples and their corresponding outputs.
Depending on the type of application and its characteristics, different types of neural networks have been developed and successfully applied to prediction problems in various areas of knowledge such as biology, medicine, economics, engineering and psychology, among others \cite{pol2000prediccion}.
Decision making is a key issue in the areas mentioned above, since decisions must be evaluated against criteria of evidence and experience. Artificial neural network (ANN) models have been used as an evidence-based tool for making such predictions.
Some works, such as that of Javier Trujillano in \cite{trujillano2004aproximacion}, apply these methods to the prediction of medical outcomes, for example in renal failure, with the aim of drawing sound conclusions about the likely course of the disease. That article compared ANNs against linear regression; the result favoured the ANNs, since the regression model needs additional covariates in order to approximate results similar to those obtained with the neural networks.
Alfonso Palmer, in \cite{pol2000prediccion}, predicts ecstasy consumption using artificial neural networks in order to discriminate between those who consume ecstasy and those who do not. The results show that the ANN developed is able to predict ecstasy consumption from the answers given to a questionnaire, with an accuracy of 96.66%.
Juan David, in \cite{henao2006modelado}, uses an artificial neural network model to represent the dynamics of the Colombian real exchange rate index, because it describes the dynamics of the series better than an autoregressive linear model, as shown by the likelihood-ratio test. The model was accepted after a series of standard tests were applied to it and after its results were contrasted with those obtained using an autoregressive linear model. The results indicate that the current value of the series depends only on its previous value.
In particular, the use of ANNs has attracted great interest because of their ability to represent unknown relationships directly from the data.
The rest of the article is organized as follows. Section [preliminares] gives a general description of the concepts used in this work, while the methodology is presented in Section 3. The experiments and results are described in Section 4, followed by the conclusions.
A6
Form a point set of all circle centers. Use a WSPD to find the closest pair of centers in \(O(n\log n)\). Return disjoint if the distance between these points is > 2; otherwise return non-disjoint.
Disjoint circles
\(\iff\) all pairwise circle centers are > 2 apart (unit circles, \(r=1\); two circles intersect exactly when their centers are within \(2r = 2\) of each other)
\(\iff\) the closest pair of centers is > 2 apart \(\square\).
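A minimal sketch of this decision procedure (a k-d tree nearest-neighbour query is used as a stand-in for the WSPD-based closest-pair step; `centers` is an assumed n-by-2 array of circle centers):

```python
import numpy as np
from scipy.spatial import cKDTree

def circles_disjoint(centers: np.ndarray, r: float = 1.0) -> bool:
    """Return True iff all circles of radius r around the given centers are disjoint."""
    tree = cKDTree(centers)
    # For each center, query itself (distance 0) and its nearest other center.
    dists, _ = tree.query(centers, k=2)
    closest_pair_dist = dists[:, 1].min()
    return closest_pair_dist > 2 * r

# Example: three well-separated unit circles.
print(circles_disjoint(np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])))  # True
```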
BSHMM: A model for Markov-based DNA methylation profiling and a case study in diatoms.
The internship took place in the Laboratory of Quantitative and Computational Biology in Paris. The lab is led by A. Carbone and is affiliated with both UPMC and CNRS. The research focuses on interdisciplinary computational biology, promoting a tight collaboration between theoretical and experimental approaches, both conducted in the same lab within seven teams composed of biologists, computer scientists, statisticians and biophysicists. Under the supervision of Hugues Richard, I was part of the analytical genomics team, whose research spans two main subjects: protein evolution and modelling, and sequence evolution.
The idea of studying methylation patterns with a statistical method was initiated during the first year of the master's degree as a compulsory project. The goal was to construct and implement a model inspired by the Ph.D. thesis of Bogdan Mirauta, with his active help and supervision. Guillaume Viejo, a fellow student at the time, and I had to repurpose Parseq \cite{Mirauta_2014}, a model aimed at RNA-Seq data analysis, and turn it into a reliable DNA methylation profiling tool, starting from a library of bisulfite sequencing (BS-Seq) data.
A six-month voluntary internship further extended this work. Even though the main motivation of the project remained the same, the statistical methods were heavily simplified: from a sophisticated Monte Carlo method combined with Gibbs particle sampling to a more practical and simpler 3-layer hidden Markov process of order 1, more in keeping with the aspirations of an internship research project. The tool was almost entirely implemented during this period and dubbed BSHMM, for BS-Seq Hidden Markov Model. It proved effective on simulated data, but no validation had yet been conducted in real-world conditions. In addition, during this year I presented a poster on the tool at the CJC (Jeunes Chercheur des Cordeliers) meeting, which is mainly aimed at Ph.D. students.
This second internship was an immediate follow-up to the development of BSHMM. We first sought to validate our results by comparing them to those of a different methylation experiment, based on microarrays, that drew the 5-methylcytosine (5mC) profile of Phaeodactylum tricornutum. The second part consisted of using the tool we implemented within a larger analysis pipeline. Recent publications have shown how methylation profiles exhibit spatial periodicity and play an important role in chromosome arrangement inside the nucleus of some diatom species via nucleosome linkage \cite{Huff_2014}. Besides, the same type of periodicity has been observed in the expression level of small RNAs, although it is still unclear whether these two events are related to the same biological process. The goal is to figure out whether this periodicity is also present in P. tricornutum, and whether it is linked in any way to the placement patterns of small non-coding RNA (snRNA) derived fragments.
Working title: Variation in grassland community trait patterns over climate gradients.
A central goal in ecology is to identify and understand the processes that influence the distributions of species in space and time. Often, these assembly processes are not directly observable over feasible time scales and must instead be inferred through pattern \cite{Levin_1992}. One increasingly popular approach is to use the values and abundances of species traits in a community as evidence for the influence of particular assembly processes \cite{Cavender_Bares_2004,Ackerly_2007,Kraft_2008}. Trait-based approaches have several advantages over strictly taxonomic approaches in that they are quantitative, easily generalizable, and have explicit ties to ecological strategy and performance \cite{McGill2006,Violle_2007}.
Unfortunately, inferring process from community trait patterns is not always straightforward because different processes can lead to similar patterns, multiple processes can operate simultaneously on multiple traits, and patterns can be affected by exogenous forces. For example: community assembly is sometimes depicted as a balance between environmental filtering, in which species unable to tolerate environmental conditions are filtered out resulting in a clustering of trait values, and niche differentiation, in which competition and limiting similarity result in trait values that are more evenly spaced than expected by chance \cite{Cavender_Bares_2004,Kraft_2007}. But recent work has shown that environmentally-filtered communities can result in random or overdispersed trait patterns (e.g. when there is sufficient within-community environmental heterogeneity) \cite{DAndrea2016}, and competition-structured communities can result in clustering patterns \cite{Mayfield_2010}. In addition, pattern-based evidence of assembly processes can be obfuscated by propagule pressure from adjacent communities \cite{Leibold_2004}, or by fluctuating environmental conditions that favor different species over time \cite{Chesson_1981,Chesson_1994}.
Although it is unlikely that a single pattern-based test will ever provide incontrovertible evidence for niche differentiation, analysis of community trait structure can still shed light on assembly processes if used properly. Different metrics should be used in complementary ways to provide more detailed, and thus more interpretable characterizations of community trait structure. In one recent study, \cite{DAndrea2017} suggest a stepwise analysis pipeline in which potential niches along trait axes are identified using a clustering algorithm, and if clusters are identified, then the fine-scale abundance structure within each cluster is examined for evidence of distance-based competition. Next, tests of community trait structure should be conducted along environmental gradients where they can potentially be tied to mechanistic predictions derived from existing ecological theory \cite{Webb_2010}. Lastly, analyses of community trait structure should be used to develop and select hypotheses for experimental testing in the field, rather than be considered as compelling standalone evidence.
Here, we apply a suite of newly developed and classical metrics of community trait structure to a network of twelve grasslands positioned along temperature and precipitation gradients in southern Norway. Our tests include measures of clustering, fine-scale trait abundance structure, and whole-community trait abundance structure. We look for community-level patterns in four traits: leaf area, maximum potential canopy height, seed mass, and specific leaf area (SLA). Based on our knowledge of the system, we predict a gradual shift in importance from competitive interactions at the coldest sites to environmental filtering at the most stressful sites. We expect that competition for light will be the strongest competitive factor at the warmest sites, and thus that there will be competition-derived clustering in maximum height and leaf area. We expect there to be niche differentiation in SLA at the coldest sites, where there could be a tradeoff between risky fast-growth strategies and the ability to tolerate or avoid early season frosts. Ultimately, our work uses trait-based predictions of community assembly processes to glean information about the relative influence of assembly mechanisms on grassland community composition.
We measured four traits: leaf area (LFA), specific leaf area (SLA), maximum plant height (MXH), and seed mass (SDM). We standardized our traits by taking the logarithm of the trait value and rescaling the logarithms to range between 0 and 1. We applied our tests to each trait individually, as well as to the Euclidean space formed by these traits, which is a four-dimensional hypercube of side 1.
For each site we calculate its Rao quadratic entropy, defined as $Q=\sum_{i=1}^{S-1}\sum_{j=i+1}^{S} d_{ij}\,p_i p_j$, where $p_i$ and $p_j$ are the relative abundances of species $i$ and $j$, $d_{ij}$ is the absolute trait difference between them, and the sum is over all species pairs. It corresponds to the expected trait difference between two individuals randomly sampled (with replacement) from the community. We also used the functional dispersion metric proposed in \cite{Laliberte2010}, defined as the abundance-weighted mean distance $d_i$ between each species $i$ and the community trait centroid. That is, $\mathrm{FDis} = \sum_i p_i d_i$. When a single trait is considered, this is simply $\sum_i p_i \left|x_i - \sum_j p_j x_j\right|$, where $x_i$ is the trait value of species $i$. Both indices have been used to quantify community functional diversity \cite{Botta-Dukat2005,Laliberte2010,Ricotta2011}. A high value indicates trait overdispersion, i.e. species cover a wider region of trait space than expected by chance. In contrast, a low value suggests that species are being filtered toward a particular trait value, possibly due to selection for optimal tolerance to local environmental conditions \cite{Keddy1992}.
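For illustration only (not the authors' code), a minimal sketch of these two indices for a single trait, where `x` holds species trait values and `p` their relative abundances:

```python
import numpy as np

def rao_q(x: np.ndarray, p: np.ndarray) -> float:
    """Q = sum over species pairs (i < j) of |x_i - x_j| * p_i * p_j, as defined above."""
    d = np.abs(x[:, None] - x[None, :])   # pairwise absolute trait differences
    return 0.5 * float(p @ d @ p)         # full double sum counts each unordered pair twice

def fdis(x: np.ndarray, p: np.ndarray) -> float:
    """Abundance-weighted mean distance to the abundance-weighted trait centroid."""
    centroid = float(p @ x)
    return float(p @ np.abs(x - centroid))

# Toy example: three species with relative covers 0.5, 0.3, 0.2.
x = np.array([0.1, 0.4, 0.9])
p = np.array([0.5, 0.3, 0.2])
print(rao_q(x, p), fdis(x, p))
```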
In addition to test statistics based on trait dispersion, we also used a measure of the degree of even spacing between adjacent species on the trait axis. The metric is defined as $\mathrm{CV} = \sigma/\mu$, where $\mu$ and $\sigma$ are respectively the mean and standard deviation of the distances between closest neighbors in trait space. When a single trait is considered, species can be ordered by trait value, and the distance vector is $d_i = |x_i - x_{i+1}|$ between adjacent species $i$ and $i+1$. A low CV indicates even spacing. Even spacing has been proposed as indicative of niche differentiation, as it maximizes exploration of niche space \cite{Mason2005}, and minimizes competitive interactions caused by trait similarity \cite{MacArthur1967}. On the other hand, recent work has raised the possibility that resource partitioning may actually lead to species clustering on the trait axis \cite{Scheffer2006}. In particular, clusters in trait space are expected if competitive exclusion is slow or if immigration replenishes species that are not niche-differentiated \cite{DAndrea2016}. Given this possibility, the coefficient of variation may actually be higher than expected by chance.
Although species may be clustered, they may still sort into niches that in turn are evenly spaced. This could occur if competition is caused by trait similarity \cite{Scheffer2006,DAndrea2017}. In that case, the most abundant species in the community might be expected to be evenly spaced even though the community as a whole is clustered. Based on these considerations, we used the CV in two metrics. First, we considered all species in the community without regard for abundance. A similar test statistic, the variance divided by the range, is commonly used to quantify evenness \cite{Stubbs2004,Kraft2008,Ingram2009}. Second, we gradually remove species from the community in increasing order of abundance, at each step calculating the CV among the remaining species. If the CV declines as the least abundant species are progressively removed, this suggests even spacing between niches concomitant with clustering between species.
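Again purely as an illustration, a sketch of the CV statistic and of the abundance-removal trend described above; `x` are trait values, `abund` the corresponding abundances, and summarizing the trend with a linear slope is an assumption of this sketch:

```python
import numpy as np

def spacing_cv(x: np.ndarray) -> float:
    """Coefficient of variation of gaps between adjacent species on the trait axis."""
    gaps = np.diff(np.sort(x))
    return float(gaps.std() / gaps.mean())

def cv_removal_trend(x: np.ndarray, abund: np.ndarray, min_species: int = 3) -> float:
    """Slope of CV as species are removed in increasing order of abundance.

    A negative slope suggests even spacing among the abundant species (niches)
    even if the full community is clustered.
    """
    order = np.argsort(abund)                   # least abundant first
    cvs = []
    for k in range(len(x) - min_species + 1):
        remaining = np.delete(x, order[:k])     # drop the k least abundant species
        cvs.append(spacing_cv(remaining))
    steps = np.arange(len(cvs))
    return float(np.polyfit(steps, cvs, 1)[0])  # linear trend in CV vs removal step
```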
Finally, we test for the presence of clusters directly by applying a cluster-finding method. Our metric uses a k-medoid clustering algorithm, which partitions trait space into groups (clusters) of species, each group with a specific medoid, i.e. the species that is closest to all other members of its group. It is an iterative process which alternately decides cluster membership and medoid identity by minimizing the average distances in trait space between species and the medoids of their clusters \cite{Kaufman1990}. We implement the algorithm using the function clara in R package cluster \cite{Maechler2016}. For each community-year, we find the number of clusters that best fits the data using R’s optim function for Markov chain Monte Carlo optimization \cite{RCoreTeam2015}. The quantity being optimized is the average silhouette width, a measure of how similar individuals are to their own cluster compared to neighboring clusters \cite{Kaufman1990}. Once the optimal number of clusters is found, the test statistic is the optimized average silhouette width. We then test for clustering by comparing the test statistic against the set of null communities.
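For illustration, a sketch of choosing the number of clusters by maximizing the average silhouette width; k-means (scikit-learn) is used here purely as a stand-in for the k-medoid step performed by `clara` in the actual analysis, and all names are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_silhouette(traits: np.ndarray, k_max: int = 8, seed: int = 0):
    """Pick the number of clusters that maximizes average silhouette width.

    `traits` is an (n_species, n_traits) array.
    Returns (best_k, best_average_silhouette_width).
    """
    best_k, best_s = None, -np.inf
    for k in range(2, min(k_max, len(traits) - 1) + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(traits)
        s = silhouette_score(traits, labels)
        if s > best_s:
            best_k, best_s = k, s
    return best_k, best_s
```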
In order to create null communities against which to compare our data, we used a mainland-island approach, where each site undergoes zero-sum birth-death neutral dynamics and immigration from a fixed regional species pool \cite{Hubbell2001}. For each site, the regional pool includes all species falling within the observed trait range, with the regional abundance of each species calculated as the mean across all sites. For each site we estimated immigration rates by fitting a neutral model to the observed relative species cover, and estimated community size by matching the neutral simulated communities to observed species richness. Estimated community size ranged from 215 individuals for Fauske to 567 for Gudmedalen, and immigration rate ranged from 0.03 for Ovstedal to 0.53 for Lavisdalen. For each site we simulated 1,000 neutral communities.
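A minimal sketch of the zero-sum neutral dynamics used to generate null communities (simplified and not the authors' implementation; `pool_p` is the regional relative-abundance vector, `J` the community size and `m` the immigration rate, all assumed to be estimated elsewhere):

```python
import numpy as np

def neutral_community(pool_p: np.ndarray, J: int, m: float,
                      steps: int = 100_000, seed: int = 0) -> np.ndarray:
    """Zero-sum birth-death dynamics with immigration from a fixed regional pool.

    Returns local abundances per species (length = number of species in the pool).
    """
    rng = np.random.default_rng(seed)
    S = len(pool_p)
    # Start the local community as a random draw from the regional pool.
    local = np.bincount(rng.choice(S, size=J, p=pool_p), minlength=S)
    for _ in range(steps):
        # One individual dies, chosen uniformly at random.
        dead = rng.choice(S, p=local / J)
        local[dead] -= 1
        # It is replaced by an immigrant (prob. m) or by local reproduction (prob. 1 - m).
        if rng.random() < m:
            born = rng.choice(S, p=pool_p)
        else:
            born = rng.choice(S, p=local / (J - 1))
        local[born] += 1
    return local
```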
To test for significance, for each of our sites in a given year we compare the metric value to the (1 − α)-quantile of the corresponding set of null communities. Of our five metrics, three (Rao, FDis, CV) are two-tailed, as both low and high values can be interpreted to suggest specific community assembly processes, while the other two (CV trend, Clara) are one-tailed. We use significance level α = 0.025 for the two-tailed tests and α = 0.05 for the one-tailed tests.
Fig. 2 summarizes our results for the 2009 census. Bar plots show the percentage of the 12 sites that tested significant against the set of null communities. We focus on the 2009 census, but our results were consistent across the years (see Figure S2 in the Supplement), indicating that deterministic factors are playing a role in the trait structure of our communities.
Leaf area and SLA, which are related traits, had similar results across most tests. Between 30% and 50% of sites were significantly overdispersed according to Rao and FDis. A smaller percentage (20%) of sites were significantly underdispersed in SLA. The CV was significantly high for leaf area in 50% of the sites, indicating uneven spacing between adjacent species. Results were weaker and more ambiguous for SLA: spacing between adjacent species was significantly even in 20% of sites, and significantly uneven in another 10%.
In contrast, seed mass showed the strongest indication of underdispersion. 30% and 50% of sites had significantly low Rao and FDis indices, respectively. Furthermore, there was no significant evenness in any of the sites according to the CV metric, and 25% of the sites showed a significant negative trend in CV as low-abundance species are removed.
Results were ambiguous for maximum plant height. Rao and FDis results were relatively strong but split between significant overdispersion and underdispersion, with the latter being a slight majority. Our CV result was also mixed, with 30% of sites indicating even spacing between species while another 20% indicated the opposite pattern. 20% of sites had a significant negative trend in CV as rare species are removed.
When all four traits were considered together in a Euclidean trait space, results were somewhat ambiguous for the functional dispersion metrics but tended towards overdispersion (30% overdispersion against 20% underdispersion). According to the CV, species were evenly spaced in this multidimensional space in 20% of the sites, and were not significantly uneven in any site.
Rao and FDis results were largely consistent with each other for all traits and the Euclidean space, corroborating previous results that indicate these two statistics are related \cite{Laliberte2010}.
A low percentage of sites, between 10% and 25%, showed evidence of significant clustering according to the CV trend and Clara metrics. Particularly for Clara, numbers were consistently low across traits and the Euclidean space, averaging just above 10% detection of significance. Given the null expectation of significance in 5% of the sites because of our α = 0.05 significance cutoff, these results suggest that species are not sorting into distinguishable clusters in our sites.
Figure 3 shows the variation in the standard score of our Rao results against the mean summer temperature of our sites. We see a significant trend in Rao scores against temperature for SLA and max height, as well as for the Euclidean space. The trend is negative in all cases, indicating that colder sites tended to be more overdispersed than warmer sites.
Results for the other metrics across the years are shown in Figure S3 in the Supplement and summarized in Table 1. Aside from FDis, which showed trends similar to Rao for the same traits, we found a negative trend in CV for leaf area in two years and for seed mass in one year, and a positive trend in Clara for SLA and the Euclidean space in one year. We also see that for SLA the positive slope in CV as low-abundance species are removed was slightly steeper at higher temperatures, whereas for leaf area, max height, and the Euclidean space the opposite was observed. It should be noted that although consistent across years, those trends were weak and the standard scores involved had small magnitude.
Trends were for the most part consistent across years. No trait showed opposite trends in different years, and many trends were observed in all four years, while some occurred in one, two, or three years (Table 1, see also Fig. 3S). We also checked for trends against mean annual precipitation, but found largely nonsignificant results (Fig. 3S).
Our results indicate that, relative to the regional pool, the leaf traits were often overdispersed in our local alpine communities, in the sense that species with extreme trait values tend to be more abundant than would be expected from a random draw from the pool. We found no evidence that species in local communities are evenly spaced on the leaf trait axes; on the contrary, species tended to be unevenly dispersed in leaf area. There was some suggestion, however, that even spacing occurred between the most abundant species. Lastly, species rarely seemed to form recognizable groups in these leaf traits. The trait overdispersion concomitant with the lack of even spacing is compatible with the hypothesis that species are being selected into distinct functional groups or niches, but within each niche species either compete neutrally or are selected for a particular trait value.
Seed mass showed the opposite behavior of leaf traits, as a sizeable fraction of our local communities were underdispersed in seed mass: species with a particular seed mass tended to occur more frequently or be more abundant than those deviating from the optimum, possibly because they are better adapted to local conditions or because they are better competitors. There was some suggestion of even spacing among the most abundant species, as a quarter of the sites showed a negative trend in CV as low-abundance species were removed.
Even spacing between adjacent species was distinctly observed in the Euclidean space formed by leaf area, SLA, maximum plant height, and seed mass. Spacing seemed more even between the more abundant species in about one in five sites. Trait dispersion results were ambiguous in the Euclidean space, with the number of significantly underdispersed communities roughly matching that of overdispersed communities. Overall, these results are compatible with the classical idea that species avoid competition by maximizing interspecies distances in niche space.
\label{fig:Fig1}
\label{fig:Fig2}
\label{fig:Fig3}
\label{table:Table1}
Mathematically, we standardize by defining $y_i = \bigl(\log(x_i) - \log(x_{\min})\bigr) / \bigl(\log(x_{\max}) - \log(x_{\min})\bigr)$, where $x_i$ is the trait value measured for species $i$, and $x_{\min}$ and $x_{\max}$ are the lowest and highest trait values observed in the data.
The standard score measures the difference between the data and the null communities relative to the variation across the nulls. If the test score in a site was $x$, and the mean and standard deviation of the null scores were respectively $\mu$ and $\sigma$, then the standard score is $z = (x - \mu)/\sigma$.
Location and routing algorithms. The Kalman-Petrov algorithm
Suppose the input signal is described by a first-order autoregressive process: \begin{equation}\label{lab1} x_{t} = \phi \cdot x_{t-1} + u \cdot t + \upsilon_{t} = 0.26 \cdot x_{t-1} + 0.8 \cdot t + \upsilon_{t} , \end{equation} where $\upsilon_{t} \sim N(\bar{m_{\upsilon}},\sigma^2_{\upsilon})$ is the process noise of the signal, a random variable with a normal distribution with parameters $\bar{m_{\upsilon}} = 0$ (mean) and $\sigma^2_{\upsilon} = 0.2$ (variance of the process noise). The observation equation has the following form: \begin{equation}\label{lab2} y_{t} = \gamma \cdot x_{t} + \epsilon_{t} = 0.72 \cdot x_{t} + \epsilon_{t} , \end{equation} where $\epsilon_{t} \sim N( \bar{m_{\epsilon}},\sigma^2_{\epsilon} )$ is likewise a random variable, the observation noise, normally distributed with parameters $\bar{m_{\epsilon}} = 0$ (mean) and $\sigma^2_{\epsilon} = 5$ (variance of the observation noise).
Moreover, the observation noise and the process noise are assumed to be uncorrelated, i.e. $E(\epsilon_{t-i}\,\upsilon_{t-j}) = 0$ for all $i, j$.
The aim of this work was to simulate the operation of a Kalman filter \cite{Gen_ay_2002} on a given signal ([lab1]), where the process-noise variance $\sigma^2_{\upsilon}$ and the observation-noise variance $\sigma^2_{\epsilon}$ serve as the input data.
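A minimal sketch of such a simulation (a scalar Kalman filter for the model above; the number of steps and the initial estimate variance are illustrative assumptions):

```python
import numpy as np

phi, u, gamma = 0.26, 0.8, 0.72      # model coefficients from the equations above
q, r = 0.2, 5.0                       # process-noise and observation-noise variances
T = 100
rng = np.random.default_rng(1)

# Simulate the AR(1) state x_t and the noisy observations y_t.
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + u * t + rng.normal(0, np.sqrt(q))
    y[t] = gamma * x[t] + rng.normal(0, np.sqrt(r))

# Scalar Kalman filter: predict, then correct with each observation.
x_est = np.zeros(T)       # filtered state estimate
P = 1.0                   # initial estimate variance (assumed)
for t in range(1, T):
    # Prediction step.
    x_pred = phi * x_est[t - 1] + u * t
    P_pred = phi**2 * P + q
    # Correction step.
    K = P_pred * gamma / (gamma**2 * P_pred + r)    # Kalman gain
    x_est[t] = x_pred + K * (y[t] - gamma * x_pred)
    P = (1 - K * gamma) * P_pred
```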
Location and routing algorithms. The Kalman-Salov algorithm
Let $x_{t}$ denote the quantity we will measure and then filter. We will measure the coordinate of an armored train that can only move forwards and backwards. The motion of the train is given by $$ x_{t}=5+2\cdot t+0.1\cdot t^{2}. $$ Let us express the train's coordinate through its acceleration and its previous position: $$ x_{t+1}=5+2\cdot (t+1)+0.1\cdot (t+1)^{2}=5+2\cdot t+0.1\cdot t^{2}+2+0.2\cdot t+0.1=x_{t}+0.2\cdot t+2.1. $$ Since taking the second derivative of $x_{t}$ shows that the acceleration equals 0.2, the train's coordinate evolves according to $$ x_{t+1}=x_{t}+a\cdot t+2.1, $$ where $a=0.2$. In real life, however, we cannot account in our calculations for the small disturbances acting on the train, such as wind, track quality, and so on, so the true coordinate of the train will differ from the computed one. A random variable $E_{t}$ is therefore added to the right-hand side of the equation: $$ x_{t+1}=x_{t}+a\cdot t+2.1+E_{t}. $$ We have mounted a GPS sensor on the train. The sensor measures the coordinate $x_{t}$, but unfortunately it cannot measure it exactly and does so with an error $N_{t}$, which is also a random variable: $$ z_{t}=x_{t}+N_{t}. $$ The task is, knowing the inaccurate sensor readings $z_{t}$, to find a good approximation to the true coordinate of the train $x_{t}$. We will denote this approximation by $x_{t}^{opt}$. The equations for the coordinate and the sensor readings thus look as follows: $$ \begin{cases} x_{t+1}=x_{t}+a\cdot t+2.1+E_{t}\\ z_{t}=x_{t}+N_{t} \end{cases} $$
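A short simulation sketch of this state and observation model (the noise standard deviations are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
a, T = 0.2, 50
sigma_E, sigma_N = 0.5, 4.0       # assumed std. dev. of the process and sensor noise

x = np.zeros(T)                   # true train coordinate
z = np.zeros(T)                   # GPS readings
x[0] = 5.0                        # x_0 = 5 from the motion law above
z[0] = x[0] + rng.normal(0, sigma_N)
for t in range(T - 1):
    x[t + 1] = x[t] + a * t + 2.1 + rng.normal(0, sigma_E)   # dynamics with disturbance E_t
    z[t + 1] = x[t + 1] + rng.normal(0, sigma_N)              # sensor reading with error N_t
```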
Rebalancing: cppi
Before delving into the topic of discrete rebalancing, I describe the trading policy induced by cppi products. I stay within the context of the two-asset model, i.e. cash (with zero interest rate) plus a risky asset. As usual, the risky asset is initialized with price \(p=1\). Portfolios are initialized with one dollar.
Consider a portfolio which trades so as to keep the proportion of the risky asset at \(0 \lt \pi \lt 1\). A cppi overlay consists in protecting a certain level \(\underline{p}\), which I choose to set at \(1/2\) for illustration. I'll assume that the exposure to the risky asset is decreased linearly from \(\pi\) at \(p=1\) to zero at \(p=1/2\). This gives the following exposure to the portfolio:
\[\pi(p)=\pi-2\pi(1-p)=2\pi\left(p-\tfrac{1}{2}\right),\, 1/2 \lt p \lt 1,\]
\[\pi(p)=0,\, p \lt 1/2,\]
\[\pi(p)=\pi,\, p \geq 1.\]
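A small sketch of this exposure schedule (illustrative; the target proportion `pi = 0.6` is an arbitrary placeholder and the protected level is kept as a parameter):

```python
def cppi_exposure(p: float, pi: float = 0.6, floor: float = 0.5) -> float:
    """Risky-asset proportion under the cppi overlay described above.

    Full target exposure pi at and above p = 1, zero at and below the protected
    level (floor), and linear in between.
    """
    if p >= 1.0:
        return pi
    if p <= floor:
        return 0.0
    return pi * (p - floor) / (1.0 - floor)

# Example: exposure at a few price levels, with target pi = 0.6.
for p in (0.4, 0.5, 0.75, 1.0, 1.2):
    print(p, round(cppi_exposure(p), 3))
```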
I give the calculations below, but the next two graphs illustrate the results. As is clear from the graphs (and intuitively obvious given our training), the trading policy is contrarian for \(p \geq 1\) and momentum for \(p \le 1\). Downside protection forces one to sell the asset when its price has gone down, and to buy it when it has gone up.
This closes the illustrations. I'll now move on to investigate how contrarian and momentum trading interact with discrete-time rebalancing. In the context of discrete-time rebalancing, price cycles hurt momentum trading while they benefit contrarian trading. It is clear from all the examples we have set out that continuous trading, in the ideal situation of smooth price trajectories, is completely neutral vis-à-vis price cycles. Price cycles neither create nor destroy value in such a context. This opens the door to a certain type of performance attribution which precisely measures the contribution of price cycles to discretely rebalanced portfolios.
Kalman filter example for Kirillov A.N.
Completed by: Basimov Vladislav Albertovich (group 22508)
Instructor: Kirillov Alexander Nikolaevich
The Kalman filter is a data-processing algorithm that removes noise and extraneous information. Its input is a set of measurements. These measurements are assumed always to carry some error, caused by the imprecision of the measuring instruments. In the simplest case, the measurements (the signal) obtained with an instrument can be described as the sum of a useful signal and an error. Since every instrument has a measurement error, that error arrives together with the signal, and our task is precisely to recover the original signal by removing the error. This is the purpose of the Kalman filter: to filter out of the received signal only the true signal value, discarding the distorting noise (the measurement errors).\cite{Kalman_1960}
For this practical assignment, it was decided that the following information about the incoming signal would be used when applying the Kalman filter:
The experiment being modelled is one in which a humming sound whose loudness steadily increases is recorded with a microphone in a quiet room. The amplitude of the sound wave is taken as the input signal for the Kalman filter. The amplitude of this signal grows over time (growing oscillations); see Fig. 1. The experiment uses a microphone of rather poor quality, so some noise is superimposed on the signal as the sound is recorded.
Structure Function of M51
Structure Function of M51 at Various Locations Within the Galaxy
A scalable method for automatically measuring pharyngeal pumping in C. elegans
We describe a scalable automated method for measuring the pharyngeal pumping of Caenorhabditis elegans in controlled environments. Our approach enables unbiased measurements for prolonged periods, a high throughput, and measurements in controlled yet dynamically changing feeding environments. The automated analysis compares well with scoring pumping by visual inspection, a common practice in the field. In addition, we observed overall low rates of pharyngeal pumping and long correlation times when food availability was oscillated.
Assistive Technology on College Campuses in Washington State
In 2010, the U.S. Census Bureau released a report stating that about 56.7 million Americans have a disability, making them the largest minority (Brault, 2012). The term disability covers a broad range of physical, cognitive, emotional and mental conditions that limit one’s way of life; individuals with disabilities face a broad range of technological barriers, often more than one (Pub. L. 110-325, 1990). Many Americans with disabilities experience less career success than their non-disabled peers (Kulkarni, 2014). As post-secondary education becomes increasingly crucial for employability, it is becoming more common for Americans to pursue higher education. Within the population of those with disabilities who do pursue post-secondary education, the attrition rate is high due to many barriers, including the accessibility of curriculum materials, electronic equipment and resources (Belch, Holley A, 2004). K-12 institutions are subject to different legal mandates than post-secondary institutions, making the academic and technological transition from secondary to post-secondary education difficult and often ambiguous for many students with disabilities. This paper focuses solely on the current legal and technological state of AT services in higher education within Washington State.
The term assistive technology has been legally defined as having two separate meanings. An assistive technology device is considered to be “any item, piece of equipment, or product system, whether acquired commercially off the shelf, modified, or customized, that is used to increase, maintain, or improve functional capabilities of individuals with disabilities” (Pub. L. 106-398, 2000). An assistive technology service is defined as “any service that directly assists an individual with a disability in selection, acquisition or use of an assistive technology device” (Pub. L. 106-398, 2000).
The Rehabilitation Act, the Tech Act, the Individuals with Disabilities Education Act (IDEA) and the Americans with Disabilities Act (ADA) all contribute to the legal requirements to provide AT services to students with disabilities. Section 504 of the Rehabilitation Act of 1973 “prohibits the discrimination of people with disabilities under any government run/funded program, business, establishment, etc.” (Pub. L. 93-112, 1973). Section 504 also requires that these agencies make proper accommodations for those with disabilities. Section 508 of the Rehabilitation Act was added by Public Law 99-506 as an amendment that ensures individuals with disabilities access to computers and other electronic office equipment (Pub. L. 93-112, 1973).
The Tech Act (otherwise known as Public Law 100-407) was signed into law in 1988 to guide states in developing and implementing systems that provide a variety of technological assistance to all individuals with disabilities, as well as to their parents and legal guardians. The main role of the Tech Act is to provide financial assistance to states for identifying and assessing accommodation needs and technological resources, providing assistive technology services, and conducting public awareness programs (Pub. L. 104-334, 1988).
IDEA, passed in 1975, ensures that students with disabilities are provided with a Free Appropriate Public Education (FAPE) (Pub. L. 101-476, 1990). However, IDEA holds only K-12 institutions to a legal mandate, as K-12 education is a legal right. Because higher education is an option, not a legal right, college institutions are not required to comply with IDEA (Pub. L. 101-476, 1990).
The ADA, enacted in 1990, “prohibits discrimination against persons with disabilities in the areas of accessibility, employment, public services, public accommodations, transportation and communication” (Pub. L. 101-336, 1990). This means all post-secondary institutions are legally required to provide all students with equal access to the academic materials, facilities, or other tools necessary to graduate. It does not specifically define technological accommodations; however, it provides a legal basis for students to individually request accommodations from colleges.
In a 2005 unfunded-mandates survey conducted by the U.S. Conference of Mayors, 38 U.S. cities reported a combined recurring annual cost of $24,445,506 for upholding the ADA (U.S. Conference of Mayors, 2005). A reasonable and vital goal for the U.S. public is to urge government funding for legal mandates like the ADA, which can require cities and public institutions to undertake large renovations and projects, discouraging real progress when there is a funding deficit.
Three separate onsite interviews were conducted with the assistive technology program directors at The Evergreen State College, Pierce College, and University of Washington (Seattle campus) respectively, in the initial stages of project planning to gain a more complete understanding of the current state of assistive technology at college campuses in Washington. Interview questions were designed to examine standard services provided in dedicated and shared spaces, the process for receiving accommodation requests, any plans to implement new or additional assistive technology, accessibility of paper and online materials, and OCR complaints (if any).
The following is a brief description of the key observations that were collected:
The Evergreen State College, a public liberal arts college, has a designated Assistive Technology Lab, separate from the general-use computing center, equipped with computer stations loaded with Read & Write Gold, JAWS, Dragon NaturallySpeaking, Inspiration, ZoomText and LearningAlly. Physical assistive devices such as ergonomic keyboards, alternative mice, large-font keyboards and Dragon-certified headsets are also available to patrons of the lab (“Assistive Technology (AT) Lab”). A standalone AT station, consisting of two machines loaded with the same software, is located on the general-use computer center floor. The Evergreen AT Lab expressed concern that crucial program materials such as syllabi are often not accessible to students, particularly those who use screen readers, because tools such as Style headings in Word (which allow screen-reader users to easily tab through section headings) are not used. The AT directors mentioned that there are several resources, such as a braille printer, that they are currently unable to provide due to a combination of inadequate funding and the lack of the specialized staff that would be required to operate them.
Pierce College, a two-year institution, also has a designated AT lab. The lab machines are equipped with the following software: Dragon NaturallySpeaking, Inspiration, MathType, Open Book, JAWS, ZoomText, WYNN and Text Help. Ergonomically assistive devices such as Dvorak keyboards, sip-and-puff technology, computer mice with trackballs and adjustable tables are also available to students. Pierce College also allows certain types of equipment (ergonomic armrests, digital recorders, ergonomic chairs and tables) to be checked out by students for use outside of the Assistive Technology Lab’s operating hours. Pierce College expressed interest in moving towards free or open-source software in an attempt to decrease operational costs and free up the budget for more resources. As at Evergreen, the program coordinator expressed great concern about the difficulties that arise when working as a liaison between the student requesting accommodations and faculty members, because faculty often see integrating accessibility into the curriculum as an afterthought rather than as an integral step in its development. An additional obstacle highlighted was the lack of adequate funding for the consistent replacement of old or outdated equipment.
University of Washington (UW), a large public research university, operates a standalone Access Technology Center. In addition to having a standalone AT lab, UW is associated with DO-IT (Disabilities, Opportunities, Internetworking, and Technology), a program designed to increase the participation of students with disabilities in academia and the workforce by providing resources for students, educators, parents, and employers (“Disability Services Office”). The lab machines are loaded with the following software: Apple Universal Access, ClaroRead, Dragon NaturallySpeaking, FineReader, JAWS, NaturalReader, Read & Write Gold, Windows Magnifier, and ZoomText (“Accessible Software”). The AT Center reported that since the deployment of assistive software to the general computer labs on campus, there has been a significant decrease in AT Lab use. UW has a selection of ergonomic keyboard and mouse alternatives as well as adjustable tables and ergonomic chairs. The AT Center also provides embossing services for the creation of tactile graphics and braille. Because the University of Washington is a considerably larger institution, funding for the AT Center was not perceived as an issue by the directors of the program.
The initial objective for researching assistive technology in academia was to examine two areas of interest: the integration of accessibility principles into computer science curriculum among post-secondary institutions and the question of whether or not universal accessibility is achievable.
Preliminary research and interviews with the directors of three assistive technology labs in Washington State led to the conclusion that there is an abundance of assistive software and hardware currently available, with varying levels of user satisfaction. The primary challenge faced by students with disabilities is not a lack of effective assistive technology, but rather a lack of integration of accessibility principles into post-secondary education as a whole. Legal ambiguity and lack of funding contribute heavily to this lack of support for students with disabilities. College officials rely heavily on students to report a need for an accommodation.
Furthermore, the development of technology over the past decades has resulted in an increased use of digital materials in the classroom, from entrance examinations, to course syllabi, to full length recorded lectures. The transition towards an increasingly web and technology driven curriculum presents a particular set of barriers to students with disabilities. A research study titled “Monitoring for Accessibility and University Websites: Meeting the Needs of People with Disabilities” published in the Journal of Postsecondary Education Disability found that in 2011 “only 51% of 509 web pages at a large public university in the northeastern United States” were evaluated to be accessible, with the most common errors listed as follows: “‘Form Label missing’, ‘Alt-tag’ missing, empty links, improper heading structure, and issues with the footer” (Solovieva, 113). Accessible website design is especially necessary for users with some form of visual impairment who rely on screen readers to navigate web pages.
Following preliminary research, the decision was made to narrow the scope of the project to specifically measure the quality of assistive technology on college campuses in Washington. It was decided that a survey would be developed to comprehensively assess the resources available and the procedure for receiving accommodation at each individual college, with the intent of eventually importing the data into a database that could be dynamically displayed on an accessible web page and made searchable for prospective students.
Rebalancing: short portfolios
Rebalancing a leveraged portfolio to ensure constant proportions leads to momentum trading. Indeed as the price increases, the proportion of the risky asset tends to fall under buy-and-hold. The trader needs to buy more of the rising risky asset to keep its proportion in portfolio constant.
What about a short portfolio?
As usual, I take a world with two assets, cash bearing zero interest rate and a risky asset with price \(p\). At inception, investment is initiated with one dollar. The initial price is \(p_{0}=1\). Shares are sold until the risky asset has a proportion of \(-\pi \lt 0\) in the overall portfolio. The position in cash is thus \(1+\pi \gt 1\).
Assuming the portfolio is continuously rebalanced and the price trajectory is smooth, its value as a function of the price is given by:
\[V(p)=p^{-\pi},\]
and the number of shares held as a function of the price is:
\[n(p)=-\pi p^{-\pi-1}.\]
This is an increasing function of the price so that the investment policy consists in buying more shares as they go up, and selling shares as they go down. Rebalancing to a short exposure is a momentum policy, as opposed to a contrarian policy.
I show the short trading policy and the short value functions versus the corresponding buy-and-hold quantities in the graphs below. As explained, the trading policy buys the stock if it rises and sells the stock if it falls, leading to a convex value function.
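As a sanity check (a rough numerical sketch, not part of the original derivation), one can rebalance a constant-proportion portfolio on a finely discretized smooth price path and compare the result with \(p^{-\pi}\):

```python
import numpy as np

def rebalance_value(prices: np.ndarray, pi: float) -> float:
    """Value of a portfolio rebalanced at every step to hold proportion `pi` in the risky asset.

    Negative pi corresponds to the short portfolio discussed above; cash earns nothing.
    """
    value = 1.0
    for p0, p1 in zip(prices[:-1], prices[1:]):
        shares = pi * value / p0          # rebalance: target exposure pi of current value
        value += shares * (p1 - p0)       # P&L from the price move
    return value

# Smooth path from p = 1 to p = 1.5, finely discretized.
path = np.linspace(1.0, 1.5, 20_000)
pi = 0.7
print(rebalance_value(path, -pi), path[-1] ** (-pi))  # both close to 1.5**(-0.7) ≈ 0.753
```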
gravity wave
Analysis of soccer matches with clustering of player trajectories and Mutual Information based metric
Abstract
The main goal of this project is to extract high level semantic cues from soccer match video sequences, using machine learning and computer vision techniques.
During the first year, I explored the state-of-the-art techniques that could be used to extract player trajectories from videos of a match recorded with several static cameras.
During the second year, I focused on the problem of clustering player trajectories, and in particular on designing a reliable and efficient metric to establish correspondences between trajectories.
Rebalancing and leverage
Rebalancing does not need to have a contrarian flavour. For this reason, the claim that rebalancing has benefits is empty. It needs further qualification. I illustrate this using a leveraged portfolio.
As usual, I take a world with two assets, cash bearing zero interest rate and a risky asset with price \(p\). At inception, investment is initiated with one dollar. The initial price is \(p_{0}=1\). The initial amount is leveraged to reach an exposure to the risky asset of \(\pi>1\). The position in cash is thus \(1-\pi \lt 0\).
Assuming the portfolio is continuously rebalanced and the price trajectory is smooth, its value as a function of the price is given by:
\[V(p)=p^{\pi},\]
and the number of shares held as a function of the price is:
\[n(p)=\pi p^{\pi-1}.\]
This is an increasing function of the price so that the investment policy consists in buying more shares as they go up, and selling shares as they go down. Rebalancing to a leveraged exposure is a momentum policy, as opposed to a contrarian policy.
I show the leveraged trading policy and the leveraged value function versus the corresponding buy-and-hold quantities in the graphs below.
Prediction of the carbonyl index of expanded polystyrene using function approximation with artificial neural networks
This article employs a mathematical model of function approximation using Artificial Neural Networks (ANN), fitted to the degradation behaviour of EPS (expanded polystyrene) in a CPC (compound parabolic concentrator). The objective is to obtain the value of the carbonyl index (CI) of the EPS. The input values, obtained experimentally while solar energy is concentrated on the absorber of the CPC, consist of some of the factors that influence the deterioration of the EPS: temperature (T), exposure time of the sample (t) and UVB radiation (W/m2). This model is useful for predicting the deterioration index of the polymer subjected to solar concentration in the CPC throughout the year.
Facilitating pediatric diagnosis using dynamic programming approach
The state of maternal and pediatric health in the Philippines has been an ongoing problem. This paper discusses a tool developed to expedite the process of diagnosing, informing the patient, and prescribing the appropriate medication using novel techniques in dynamic programming. By integrating a medical knowledge base, existing patient data and an inference engine, we are able to generate case-specific advice, acting as a decision support system for medical practitioners. The goal is to facilitate improved delivery of diagnoses, aiding physicians in giving timely and appropriate advice. A team of pediatric physicians piloted and tested this tool, giving it an average usability rating of 9.2 out of 10.
Keywords: Inference engine, dynamic programming, knapsack, e-health.