Paper Review :- Attention is all you need

The Idea :- The dominant sequence models are complex recurrent neural networks that include an encoder and a decoder, connected through an attention mechanism. The paper proposes that we don’t need recurrence at all: the attention mechanism alone can produce great results. The resulting architecture is called the Transformer, a paradigm shift in sequence processing, because attention shortens the path that information travels between positions and reduces the number of sequential computation steps (long chains of steps otherwise lead to information loss).

Traditional :- RNN + Attention. The paper proposes :- only Attention

How RNNs work :- RNNs perform sequential computation, which precludes parallelisation, and this becomes critical as longer input sequences are encountered. It is very hard for an RNN to learn long-range dependencies. Factorisation tricks and conditional computation do come to the rescue, but sequential processing still remains a constraint. An RNN takes the current input and the last hidden state and determines the current hidden state. Attention is a mechanism to improve the performance of the RNN: in the popular attention mechanism, the decoder is taught to pay attention to the hidden states of the encoder.

Other key words :- Self attention , end to end memory networks

The Proposed architecture :- The path length of information is much shorter now. The decoder decides which hidden states to look at by means of an addressing scheme.

Input Embedding and Output Embedding – these go into the network

Positional Encoding – tells the network where the words are in the sequence (attention on its own has no notion of order) and gives the network a significant boost
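To make the positional encoding concrete, here is a minimal sketch of the sinusoidal scheme described in the paper, where even dimensions use a sine and odd dimensions use a cosine of the position at a geometrically spaced frequency. The function name and the tiny sizes are my own choices for illustration.

```python
import math

def positional_encoding(num_positions, d_model):
    """Return a num_positions x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency: 1 / 10000^(2k / d_model).
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

# Illustrative sizes only: 4 positions, 6-dimensional embeddings.
pe = positional_encoding(num_positions=4, d_model=6)
for pos, row in enumerate(pe):
    print(pos, [round(x, 3) for x in row])
```

Each row is added to the embedding of the word at that position, so two identical words at different positions end up with different inputs.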

There are three attention blocks: attention over the input sentence (the encoder hidden states), attention over the hidden states of the part of the output sentence already produced, and a third multi-head attention that combines the input and output (in their encoded forms). The encoder of the source sentence builds the Keys (a way to index the values) and Value pairs, and the other part of the network builds the Query.

Keys, Values and the Query

Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V

Each Key has a corresponding Value, and then we introduce the Query. Compute the dot product of the Query with every Key, scale it by sqrt(d_k) (the key dimension), and apply the softmax (exponentiation and normalisation). The softmax is a soft selection: the Key with the biggest dot product gets most of the weight, and the output is the weighted sum of the Values.

The dot product between two vectors depends on the angle between them (a · b = |a| |b| cos θ), so it measures how well a Key aligns with the Query.
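A small sketch of scaled dot-product attention in plain Python may help here. It follows the formula from the paper, softmax(QK^T / sqrt(d_k)) V, but the helper names and the toy keys, values and query are made up for illustration.

```python
import math

def softmax(scores):
    """Exponentiate and normalise so the weights sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention for lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score every key against the query, scaled by sqrt(d_k).
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # The output is the weighted sum of the values.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy data (hypothetical): the query points in the direction of the first key.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
queries = [[5.0, 0.0]]
result = attention(queries, keys, values)
print(result)
```

The query aligns with the first key, so the output is pulled almost entirely towards the first value, without ever making a hard choice.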

How to read a research paper

Wisdom is not a product of schooling but of lifelong attempt to acquire it

  1. Collate resources on the topic
  2. Extract only vital information (abstract and Conclusion ) even if you don’t understand the whole paper
  3. Take structured notes and then re-read the Paper with all details
  4. Read supplemental material as needed .

Portfolio Theory

Portfolio Theory – some salient points

  • Risk of the average is not equal to the average of the risk
  • shifting the efficient frontier to the north-west ( higher return , lower risk ) optimises the portfolio , to an extent
  • efficient frontier :- standard way of showing return and risk relationship
  • Expand the universe of investible asset classes to produce a new efficient frontier
  • correlation
  • strategic asset allocation ( defined by objectives ) and tactical asset allocation ( depends upon investment policy )
  • Rebalance Portfolio if financial markets have different expected returns
  • MPT ( modern Portfolio Theory ) by Markowitz in 1952 – portfolio needs to be diversified
  • Distribution of returns ( can be derived by change in prices )
  • Metric used for measuring dispersion – standard deviation ( from the average )
  • scatter plot for measuring correlation ( -1 to 1 ) – measure for dependence
  • Three measures :- Expected return , standard deviation and Correlation
  • optimal Portfolio creation needs the above three measures only
  • variance = STD * STD
  • covariance = correlation * std1 * std2 – can take any value
  • Value appreciation vs. risk undertaken ( risk / return trade off )
  • efficient frontier : highest return at a specific level of risk
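The points above on variance, covariance and "risk of the average is not the average of the risk" can be checked with a quick two-asset sketch. The standard deviations, weights and correlations below are invented numbers, not market data.

```python
import math

def portfolio_std(w1, std1, std2, corr):
    """Standard deviation of a two-asset portfolio with weights w1 and 1 - w1."""
    w2 = 1.0 - w1
    cov = corr * std1 * std2  # covariance = correlation * std1 * std2
    variance = (w1 * std1) ** 2 + (w2 * std2) ** 2 + 2 * w1 * w2 * cov
    return math.sqrt(variance)

# Hypothetical assets: 20% and 10% standard deviation, held 50/50.
std_a, std_b = 0.20, 0.10
average_of_risks = 0.5 * std_a + 0.5 * std_b

for corr in (1.0, 0.3, -0.5):
    risk = portfolio_std(0.5, std_a, std_b, corr)
    print(f"correlation {corr:+.1f}: portfolio std {risk:.4f} "
          f"vs average of risks {average_of_risks:.4f}")
```

Only at a correlation of +1 does the portfolio risk equal the average of the two risks. For any lower correlation it is smaller, which is the diversification benefit.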

Don’t run for trains

I don’t run for trains

This is the career advice that I read and liked the most in the excellent book “The Black Swan”

1. Teach yourself to resist running to keep on schedule . In refusing to run to catch the trains ( at various stages in your life) , lies the true value of elegance and aesthetics in behaviour , a sense of being in control of your time , your schedule and your life .

2. “Missing a train is only painful if you run after it !”
Similarly , if you chase the idea of success that others expect from you , then not matching those expectations becomes painful – have more control over your life by deciding the criteria of success yourself.

Miss a TRAIN and not FEEL the PAIN.

China – some history and way beyond

Early 80’s :-
Zhang Ruimin ( known for his work in turning a little-known, bankrupt refrigerator manufacturer into the world’s largest white appliances company – Haier ) once lined up about 75 refrigerators that had been produced with some minor defect . He asked his staff as to what should be done with the defective refrigerators . Responses were – sell them at discount , offer to employees etc.
He instead did something which is legendary in Chinese business history – destroyed the first refrigerator using a big hammer and then forced his staff to do the same with all the other refrigerators and made a very emphatic statement about the manufacturing standards that a world class company should follow. That set the tone for Haier and in a way for China in the years to come .

Around 2004 :-
When fighting eBay’s arrival in China , Jack Ma famously declared “eBay may be a shark in the ocean , but I am a crocodile in the Yangtze River . If we fight in the ocean we lose , but if we fight in the river we win” and he launched Taobao. The website , offering free service to buyers and sellers on the consumer to consumer platform , created china’s first secure online payment system ( Alipay ) . Next in line were money market products called Yu’e Bao – use Alipay account to freely deposit and withdraw money and avail better rates than traditional bank products .

And the rest is history
To me , the two episodes mentioned above highlight the major shifts in mindset that have happened in China’s economic history .

But understand the history ( and the market ) first

There is government, when the prince is prince, and the minister is minister; when the father is father, and the son is son. (Analects XII, 11, trans. Legge)

Confucius believed that people must behave , rulers must govern and control is always necessary . Hence China has a long history of rejecting the free market economy : markets need to be controlled , otherwise cacophony and inefficiency prevail .

Since the early 1990s, China has consistently been the world’s fastest growing economy . It has moved from a state controlled to a market driven economy. The economic scale and size of market enables companies like Alibaba & Tencent to compete with global giants. China , although also being a complex and fickle market , has been using technology effectively ( and at a massive scale ) and is becoming a harbinger of how the world’s business environment will evolve.

What’s the future now :-
China’s poster boy companies today – Baidu , Alibaba and Tencent ( BAT ) and many more like Xiaomi , Lenovo , Huawei , Yihaodian – are the most innovative , creative and influential players , not only in the Chinese economy but slowly in the regional economy as well . The imaginative entrepreneurs who have built these companies from the ground up have shown phenomenal drive and ability in negotiating the extraordinary changes China and its economy have undergone .

China will continue to open and liberalise its market and the world will be forced to respond. One milestone event was China entering the WTO in the early 2000s ; the more recent ones are programs like Stock Connect , the QFII and RQFII quotas , and MSCI’s decision to include China A shares in the MSCI Emerging Markets Index .


China has learnt a lot in the last few decades and now has a lot to Share

Exponential Learning

The power of exponential learning

In the book “The Second Machine Age” the authors talk about Moore’s law and the power of the exponential function. While reading this fascinating book , I realised that a similar thought process can be applied to understand the need to enhance one’s skills in this age of instant learning , along with some philosophical tricks on how to do it .

Gordon Moore ( cofounder of Intel ) wrote in 1965 :- “The complexity for minimum component costs has increased at a rate of roughly a factor of two per year ” . The article was aptly named “Cramming more components onto integrated circuits ” . Very interestingly , this is not a law ( with a well defined formula and corollaries ) but rather a salute to the possibilities of engineering and science and the ability to disrupt and innovate .
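A few lines of arithmetic show why "a factor of two per year" is so powerful. This is only a sketch: the function name is mine, and the doubling period is a parameter because Moore himself later revised the pace to roughly every two years.

```python
def growth_factor(years, doubling_period_years=1.0):
    """How many times larger something is after `years` of steady doubling."""
    return 2 ** (years / doubling_period_years)

for years in (1, 5, 10, 20):
    print(f"after {years:2d} years: x{growth_factor(years):,.0f} "
          f"(doubling yearly), x{growth_factor(years, 2.0):,.0f} (every two years)")
```

Ten yearly doublings already give a factor of about a thousand, and twenty give about a million, which is the part our intuition tends to miss.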

Similarly , the minimum skill enhancement needed to stay relevant , in a fast world of evolving technologies and a generation of millennials and digital natives , has increased manifold .
The need to cram multiple skills ( and keep de-skilling and re-skilling ) into the integrated circuits of your mind is a state of affairs that will persist for multiple decades .

The key to continue having a successful and impactful contribution to your career is to keep engaging in “Brilliant Tinkering – finding detours around your normal route of learning and avoiding the roadblocks thrown up by multiple reasons to stay in the state of inertia”

When it became difficult to cram integrated circuits more tightly together , chip makers instead layered them on top of one another , opening up a great deal of real estate . We also need to keep opening the real estate available ( in terms of time, learning ability , broadened horizon ) to us to stay relevant and alert.

Wavelength Division Multiplexing ( WDM ) is the technique of transmitting multiple beams of different wavelengths down the same glass fibre at the same time , which carries far more than a single beam over the same fibre optic cable .
One can argue similarly that having different wavelengths of speciality ( depth vs. broad knowledge ) in multiple subjects travelling in the fibres of your mind is better than having only a few beams running and under-utilising the capabilities and the possibilities .

Constant evolution of skills is the steady drumbeat in the background of personal success . Our brains are well equipped to learn and outstrip our intuition about our own personal capabilities and limitations.
The brain learns at an exponential rate and the exponential learning leads to staggeringly big results – ones that further fuel the curve of exponential learnings .

The other challenge we face is that Moore’s law now also applies to the set of what there is to be learnt . Recombinant technologies , open source innovations , better algorithms and API-led thinking , along with the marvels of AI , machine learning , voice processing , distributed ledgers and digital gears : in general , everything is evolving faster than it was in the last decade.
All are relevant and true examples of the exponential improvement trajectories of Moore’s law.

One way of fast learning is to apply SLAM ( Simultaneous Localisation and Mapping ) , which is a very fascinating problem that AI researchers are solving .

SLAM is the process of building up a map of an unfamiliar territory ( read skill in our case ) as you are navigating through it – where are the things that I need to be aware of and what can trip me up.
Applying SLAM at a macro level to your career , and to the skills that you have acquired and need to acquire , can be an interesting exercise . And then apply it at a micro level while learning a skill , to have a quick , bite-sized learning experience.

Finally quoting Albert Bartlett – “The greatest shortcoming of the human race is our inability to understand the exponential function”

Target Date Funds

Reference :- An excellent article explaining target date funds ( TDF )

A target date fund is a mix of several different types of stocks, bonds and other investments. Technically, a portfolio manager uses what’s called a “glidepath” to adjust the underlying mix of investments that make up your target date fund.
Think of a glidepath as an investment roadmap. It helps determine what your risk exposure in your target date fund should be over the course of your career.
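As a rough illustration of a glidepath, the sketch below shifts the equity share linearly as retirement approaches. Every number in it (a 40-year horizon, 90% equity at the start, 30% at retirement) is a hypothetical assumption of mine. Real funds publish their own, usually non-linear, glidepaths.

```python
def equity_share(years_to_retirement, start_years=40,
                 start_equity=0.90, end_equity=0.30):
    """Hypothetical linear glidepath: equity weight falls as retirement nears."""
    # Clamp to the modelled range so very early or past dates stay sensible.
    years = max(0, min(years_to_retirement, start_years))
    fraction_remaining = years / start_years
    return end_equity + (start_equity - end_equity) * fraction_remaining

for years in (40, 30, 20, 10, 0):
    stocks = equity_share(years)
    print(f"{years:2d} years to retirement: {stocks:.0%} equity, "
          f"{1 - stocks:.0%} bonds and other investments")
```

The point is only the shape: more risk exposure early in a career, less as the target date gets close.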

There’s a reason many retirement plans choose target date funds as their default investment: They make it easy for people looking to save for retirement to potentially maximize their future retirement income.

A traditional target date fund might follow a set asset allocation or it might be designed to outperform a certain benchmark (like the S&P Target Date Indices, for example).
A goals-based target date fund, on the other hand, takes a different approach. As the name suggests, these types of funds aim to achieve specific outcomes. Maybe you want to minimize volatility. Perhaps you want to strive to have consistent income in retirement. With goals-based target date funds, the funds are built differently to help you reach your specific goal. In practice, this means customizing the investment strategy – by managing risk, broadening the scope of investments, etc. – to help increase your chances of reaching your goal.

TDFs help with diversification and managing inflation

TDF explained on

Design Thinking

My first reaction as I learnt design thinking was – “aren’t we talking about common sense here ?” . The whole paradigm of design thinking seemed to be just another hype from self-appointed consultants , experts and self-proclaimed practitioners.

But the more I read and slowly started to practise it , the more I realised that it gives a disciplined approach to one’s thought process and , more often than not , it goes beyond common sense .

It is mainly used to accelerate innovation and growth . Let me explain the 4 key stages that one goes through using design thinking :-

What is – what is the current state & Reality , Attending to the present , unarticulated needs

What if – what if anything were possible , brainstorm without worrying about constraints , recombinant creations called business concepts are created

What Wows – testing , hypothesis of business concepts – make hard choices and find the wow zone

What works – Trying prototypes with fast feedback cycles , refine and iterate

Design thinking is a problem solving approach that asks four questions ( mentioned above ) and that is human-centered, possibility-driven, option-focused, and iterative in its approach. Let ‘s think about this :-

Human-centered :- for whom we want to generate value , innovate and hence user driven design . Need to engage them and do deep exploration of what the needs are

Possibility-driven :- It uses the information we’ve learned to ask the question “what if anything were possible ?” as we begin to create new ideas about how to serve them.

Option-focused :- generating multiple options , a basket of multiple solutions.

Iterative :- Iterate your way to success by conducting cycles of real world experiments rather than running analysis using historical data.

Behavioural Finance


A Different Perspective – Individual and Institutional

Traditional Finance is based upon NeoClassical Economics where risk aversion , utility maximisation and rationality are the key words to describe the investors . Also markets are assumed to be efficient and it is assumed that prices incorporate and reflect all available and relevant information.
Behavioural finance applies Psychology to finance and attempts to understand and explain observed investor and market behaviour .
Behavioural finance Micro (BFMI ) questions rationality and decision making process of individuals
Behavioural Finance Macro ( BFMA ) questions efficiency of markets and considers market anomalies that distinguish markets from the efficient markets of traditional finance

Key Differences have been summarised in the link below 
