Financial Machina: Machine Learning For Finance: The Quintessential Compendium for Python Machine Learning For 2024 & Beyond Sampson
FINANCIAL MACHINA
Machine Learning for Finance

Johann Strauss
Hayden Van Der Post
Vincent Bisette

Reactive Publishing
"The Ghost in the Machine"
To my daughter, may she know anything is possible.
"Machine learning: a silent architect of futures unseen, sculpting
wisdom from the clay of data, in a world where understanding
evolves with each pattern revealed."

JOHANN STRAUSS
CONTENTS

Title Page
Dedication
Epigraph
Introduction
Chapter 1: Foundations of Machine Learning in Finance
1.1 The Evolution of Quantitative Finance
1.2 Key Financial Concepts for Data Scientists
1.3 Statistical Foundations
1.4 Essentials of Machine Learning Algorithms
1.5 Data Management in Finance
Chapter 2: Machine Learning Tools and Technologies
2.1 Computational Environments for Financial Analysis
2.2 Data Exploration and Visualization Tools
2.3 Feature Selection and Model Building
2.4 Machine Learning Frameworks and Libraries
2.5 Model Deployment and Monitoring
Chapter 3: Deep Learning for Financial Analysis
3.1 Neural Networks and Finance
3.2 Convolutional Neural Networks (CNNs)
3.3 Recurrent Neural Networks (RNNs) and LSTMs
3.4 Reinforcement Learning for Trading
3.5 Generative Models and Anomaly Detection
Chapter 4: Time Series Analysis and Forecasting
4.1 Fundamental Time Series Concepts
4.2 Advanced Time Series Methods
4.3 Machine Learning for Time Series Data
4.4 Forecasting for Financial Decision Making
4.5 Evaluation and Validation of Forecasting Models
Chapter 5: Risk Management with Machine Learning
5.1 Credit Risk Modeling
5.2 Market Risk Analysis
5.3 Liquidity Risk and Algorithmic Trading
5.4 Operational Risk Management
Chapter 6: Portfolio Optimization with Machine Learning
6.1 Review of Modern Portfolio Theory
6.2 Advanced Portfolio Construction Techniques
6.3 Machine Learning for Asset Allocation
6.4 Quantitative Trading Strategies
6.5 Portfolio Management and Performance Analysis
Chapter 7: Algorithmic Trading and High-Frequency Finance
7.1 Introduction to Algorithmic Trading
7.2 Strategy Design and Backtesting
7.3 High-Frequency Trading Algorithms
Chapter 8: Alternative Data
8.1 Structured and Unstructured Data Fusion
8.2 Alternative Data in Portfolio Management
Chapter 9: Financial Fraud Detection and Prevention with Machine
Learning
9.1 Understanding Financial Fraud
9.2 Feature Engineering for Fraud Detection
9.3 Machine Learning Models for Fraud Detection
9.4 Real-Time Fraud Detection Systems
Conclusion
Epilogue: Navigating Future Frontiers from Berlin
Additional Resources
Glossary of Terms
Afterword
INTRODUCTION

Paris, known for its art, culture, and innovation, is currently witnessing
a financial revolution comparable to an artistic renaissance. Advanced
machine learning is at the forefront of this transformative era,
reshaping the way we comprehend data and fundamentally changing the
rules of the finance industry. This revolution spans various aspects,
including the interpretation of intricate market dynamics, automation of
intricate trading strategies, management of diverse investment portfolios,
and evaluation of nuanced credit risks. The impact of this wave of
innovation is both continuous and significant.

The impact of machine learning in finance extends far beyond mere market
analysis. The realm of trading, once a stronghold of seasoned financial
experts, is now being revolutionized by automation. Sophisticated trading
algorithms are executing intricate strategies with a speed and precision that
far surpass human capabilities. These automated systems are not just faster;
they operate continuously, exploiting opportunities that arise outside the
conventional trading hours.

Welcome, esteemed reader, to "Financial Machina," a guide crafted in the
spirit of Paris’s tradition of enlightenment and intellectual curiosity. This
book is your beacon in the complex confluence of finance and machine
learning, offering a synthesis of knowledge designed for those eager to
master the inner workings of the modern financial landscape.

Our journey will transport you beyond the traditional realms of finance,
banking, and investment. You will discover the role of algorithms capable
of processing vast amounts of data and extracting valuable insights. We will
navigate the rich tapestry of predictive analytics, deep
learning, and reinforcement learning strategies, all of which are redefining
financial models and investment methodologies.

As your guide, we begin with the foundational concepts of machine
learning, ensuring a robust understanding of both its statistical backbone
and computational power. We will then venture into more complex areas—
deciphering patterns in unstructured data, optimizing algorithmic trading
systems, and interpreting signals amidst market noise—always linking
theoretical knowledge with practical application.

Our narrative includes case studies and real-world applications, shedding
light on the intersection of theory and financial challenges. You'll witness
the transformative impact of advanced machine learning in areas like risk
management, fraud detection, and portfolio optimization. We will also delve
into the latest advancements and ethical considerations, preparing you to
harness and responsibly direct the formidable power of machine learning in
finance.

By the conclusion of this journey, you will have a comprehensive view of
the current financial landscape as shaped by machine learning, equipped to
anticipate and navigate its future developments.

This intellectual voyage offers enlightenment and essential insights. It
encourages the embrace of interdisciplinary collaboration and urges
curiosity-driven exploration into the cutting edge of financial innovation.
The knowledge presented here extends beyond a mere glimpse into the
future, serving as a blueprint for present actions and as a manual for
trailblazers who have the potential to shape the financial landscape for
generations to come.

So, engage your intellect, ignite your ambition, and as you turn this page,
begin your ascent to the pinnacle of one of the most exciting and
transformative applications of advanced machine learning. Welcome to our
comprehensive guide—the journey starts here.

Warm Regards,
Vincent Bisette

CHAPTER 1: FOUNDATIONS OF MACHINE LEARNING IN FINANCE

1.1 THE EVOLUTION OF QUANTITATIVE FINANCE

In the brisk, electrified air of the early morning, a trader in Vancouver
gazes upon the flickering screens, a mosaic of numbers casting an
ethereal glow across the austere lines of his face. Here begins our tale of
quantitative finance, a saga of transformation that stretches from the ledgers
of antiquity to the algorithmic ballets of today's markets.

Once the preserve of the erudite economist and the calculating bookkeeper,
finance has metamorphosed, courtesy of the digital revolution, into a realm
where the quantitative analyst reigns supreme. The narrative of this
evolution is one of ceaseless innovation, a relentless quest for precision in
an unpredictable world.

In the nascent days of quantitative finance, the tools were simple, the
calculations manual. Theories of risk and return were pondered over ink
and paper, through the lens of traditional economics. Yet, as the march of
technology advanced, so too did the sophistication of financial strategies.

The 1950s saw the advent of Modern Portfolio Theory (MPT), proposed by
Harry Markowitz, which shifted the gaze of finance towards the
mathematical domains of variance and covariance. This period of
enlightenment presented a new frontier; one in which the portfolio's risk
was as integral as its return.

As the decades unfurled, the Efficient Market Hypothesis (EMH) emerged,
championed by the likes of Eugene Fama, challenging the notion that one
could consistently outperform market averages. EMH argued for a market's
perfect clairvoyance, where prices reflected all known information, leaving
no room for excess gain through analysis alone.

It was in the 1970s that the Black-Scholes-Merton model further cemented
quantitative finance as a discipline of high repute. This model delivered an
analytical closed-form solution for the pricing of European options, a feat
that revolutionized derivative markets and sowed the seeds for
computational finance.
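
To make the closed-form result concrete, here is a minimal sketch of the
Black-Scholes-Merton price of a European call and put on an asset that pays
no dividends. The function names, parameter names, and example inputs are
illustrative assumptions rather than part of any particular library.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, volatility, years):
    """Price a European call on a non-dividend-paying asset.

    rate is the continuously compounded risk-free rate and volatility is
    the annualized standard deviation of log returns (both as decimals).
    """
    sigma_sqrt_t = volatility * math.sqrt(years)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * volatility ** 2) * years) / sigma_sqrt_t
    d2 = d1 - sigma_sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


def black_scholes_put(spot, strike, rate, volatility, years):
    """Price the matching European put via put-call parity."""
    call = black_scholes_call(spot, strike, rate, volatility, years)
    return call - spot + strike * math.exp(-rate * years)


# Hypothetical inputs: at-the-money option, one year to expiry
call_price = black_scholes_call(100.0, 100.0, 0.05, 0.20, 1.0)
put_price = black_scholes_put(100.0, 100.0, 0.05, 0.20, 1.0)
print(f"Call: {call_price:.4f}  Put: {put_price:.4f}")
```

The model's assumptions (constant volatility, lognormal prices, no
dividends) are exactly the kind of simplification discussed in the
paragraphs that follow.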

Yet, the limitations of these early models, their assumptions of market
behavior, and the normalcy of data distribution became increasingly
apparent. The financial crises that rippled through the global economy laid
bare the shortcomings of traditional quantitative methods. It was clear: the
finance world needed a more adaptable, more nuanced toolbox.

Enter the era of machine learning, a renaissance of sorts for quantitative
finance. The finesse of neural networks, the adaptability of ensemble
methods, and the prescience of reinforcement learning began to redraw the
boundaries of what was possible. Financial modeling was no longer
constrained by the rigidity of old assumptions; it was now a dynamic and
predictive craft, homing in on patterns within vast and unruly oceans of data.

The evolution of quantitative finance has been both a technical journey and
a philosophical one. As the discipline continues to evolve, it incorporates
lessons from behavioral economics, recognizing the irrational quirks of
human decision-making and market movements. It is a continuing tale, one
of complexity and change, where the only constant is the relentless pursuit
of deeper understanding and greater predictive power.

This historical perspective begins in a time when statistical methods were
the backbone of financial analysis. The bell curve reigned, encapsulating
the symmetry of market returns and the hope that past data could reliably
forecast future trends. This era of Gaussian dominance was marked by a
steadfast belief in the power of linear regression, t-tests, and the
foundational principles of hypothesis testing.

Yet, the financial markets, with their tumultuous ebbs and flows, resembled
not the calm predictability of a Gaussian world but rather the wild
undulations of the Pacific Ocean, viewed from the rugged coasts of
Vancouver Island. The Black Monday crash of 1987 was a stark reminder of
this incongruence, a day when markets plummeted and the bell curve fell
short, failing to capture the fat tails and extreme events that characterize
financial returns.

The limitations of traditional statistics—its assumptions of linearity,
normality, and homoscedasticity—were becoming glaringly evident. It was
not enough to simply describe the central tendencies of data; the need to
predict and adapt to ever-shifting market conditions called for a new
analytical paradigm.

Enter the age of machine learning—a field that promised to transcend the
limitations of classical statistics. No longer were financial analysts confined
to the linearity of regression models. They now had at their disposal
decision trees that branched out with market complexity, support vector
machines that carved hyperplanes through the multi-dimensional space of
financial instruments, and neural networks that learned and adapted like the
human brain.

Machine learning introduced a newfound agility to financial analysis.
Encompassing both supervised and unsupervised learning paradigms, it
allowed analysts to uncover hidden patterns and relationships within the
data. These algorithms thrived on the chaotic abundance of market data,
teasing out signals from the noise, learning from the data, and evolving with
it.

Moreover, the advent of these sophisticated techniques coincided with an
explosion of computing power and data availability. Massive datasets—
once the exclusive purview of institutions like the Vancouver Stock
Exchange—became accessible to a broader community of quants and data
scientists, propelling the field forward at a breakneck pace.

This section paints a picture of a discipline in constant flux, one that mirrors
the organic complexity of nature itself. It is a tale of innovation driven by
both necessity and possibility, where each breakthrough in machine
learning opens new doors for finance and each financial challenge spurs
further advancements in algorithmic understanding.

As machine learning continues to redefine the boundaries of what's
achievable in finance, this historical perspective serves as a reminder that
the field's future will be shaped by those who not only grasp the
mathematical intricacies of these tools but also possess the creativity and
vision to apply them in novel and ethically responsible ways. This section,
therefore, is not just an overview of the past; it's a springboard into the
future, a call to action for those who wish to be at the forefront of the next
financial revolution.

1.1.2 Influential Financial Models and Their Limitations

There once was a widespread reverence for the classical financial models
that shaped decades of investment strategies. These models were the
stalwarts of finance, the theoretical constructs that sought to distill the
chaotic marketplace into understandable equations and predictable
outcomes.

Chief among these influential models was the Capital Asset Pricing Model
(CAPM), which posited a linear relationship between the expected return of
an asset and its risk relative to the market. The simplicity and elegance of
CAPM made it a cornerstone of financial theory, introducing the concept of
beta as a measure of systematic risk and offering insights into the pricing of
risk and the construction of an efficient portfolio.
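
The linear relationship CAPM describes can be written as expected return =
risk-free rate + beta × (expected market return − risk-free rate). The
sketch below estimates beta from two return series and applies that
formula. The function names and the return figures are hypothetical,
chosen only to illustrate the arithmetic.

```python
from statistics import mean


def estimate_beta(asset_returns, market_returns):
    """Beta = covariance(asset, market) / variance(market)."""
    mean_asset = mean(asset_returns)
    mean_market = mean(market_returns)
    covariance = sum(
        (a - mean_asset) * (m - mean_market)
        for a, m in zip(asset_returns, market_returns)
    )
    market_variance = sum((m - mean_market) ** 2 for m in market_returns)
    return covariance / market_variance


def capm_expected_return(risk_free_rate, beta, expected_market_return):
    """Expected return implied by CAPM for a given beta."""
    return risk_free_rate + beta * (expected_market_return - risk_free_rate)


# Hypothetical periodic returns for the market and one asset
market = [0.010, -0.020, 0.015, 0.030, -0.010]
asset = [0.018, -0.031, 0.020, 0.047, -0.012]

beta = estimate_beta(asset, market)
expected = capm_expected_return(0.03, beta, 0.08)
print(f"Estimated beta: {beta:.2f}")
print(f"CAPM expected return: {expected:.2%}")
```

A beta above 1 indicates that the asset has tended to move more than the
market, so CAPM assigns it a higher expected return as compensation for
that systematic risk.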

Following in the intellectual lineage of CAPM, the Efficient Market
Hypothesis (EMH) emerged, championing the idea that stock prices fully
reflect all available information. According to EMH, no amount of analysis
—fundamental or technical—could consistently yield returns above the
market average because price changes were the result of unforeseen events,
rendering markets inherently unpredictable.

The Fama-French Three-Factor Model extended the CAPM framework by
including size and value factors in addition to market risk, thus providing a
more nuanced view of what drives asset returns. This model became a
bedrock for empirical asset pricing studies, heralding a shift towards
multifactor explanations of returns that acknowledged the market's
complexity.

Despite the intellectual triumphs of these models, the limitations inherent in
their assumptions became increasingly apparent. CAPM's assumption of a
single factor (market risk) governing returns was too simplistic to capture
the multifaceted nature of risk. EMH's assertion of market efficiency
clashed with the psychological and behavioral anomalies observed by
practitioners and academics alike—phenomena that would later be
encapsulated by the field of behavioral finance.

Furthermore, these models were largely predicated on historical data,
which, as any seasoned trader at the Pacific Exchange would attest, is a
precarious foundation for future predictions. The tumultuous nature of
financial markets, with their abrupt shifts and black swan events, laid bare
the folly of relying on static models in a dynamic world.

The limitations of these traditional financial models catalyzed the search for
more adaptive and data-driven approaches. Machine learning, with its
capacity to learn from and evolve with data, began to assert its potential as a
transformative force in finance. As the industry grappled with the
shortcomings of established models, it became clear that a new era of data-
centric and algorithmically sophisticated models was on the horizon.

1.1.3 Introduction of Machine Learning in Finance

Machine learning's promise in finance lies in its inherent capacity to
uncover patterns within vast datasets—patterns too complex or subtle for
traditional statistical models to detect. This evolving field leverages
computational algorithms that adaptively improve their performance as they
are exposed to more data, a feature particularly suited to the fluid and
voluminous nature of financial information.

The transition towards machine learning was not abrupt; it was a gradual
awakening. Pioneers in the field began by applying fundamental techniques
such as linear regression to financial forecasting, only to discover that these
methods could be vastly enhanced through machine learning's nuanced
approaches. Decision trees, for example, enabled analysts to map out the
non-linear decision paths that more accurately represented financial
scenarios. Meanwhile, support vector machines offered robust classification
capabilities, proving to be powerful tools for pattern recognition in market
data.

One of the early heralds of machine learning's potential was algorithmic
trading, where automated processes could execute trades at a speed and
frequency unattainable by human traders. These algorithms were initially
straightforward, following set rules based on technical indicators. However,
as machine learning models grew more sophisticated, they began to
incorporate a variety of signals, including historical price data, news
articles, and social media sentiment, to make more informed trading
decisions.

The financial sector's burgeoning interest in machine learning also led to
advancements in risk assessment and management. Traditional risk models
often fell short in predicting extreme events, but machine learning's
predictive power brought new depth to the analysis of potential risks,
enabling institutions to react more swiftly and effectively to signs of market
stress.

Ensemble learning, a technique that combines multiple models to improve
predictive performance, began to revolutionize credit scoring. By
aggregating the insights of various classifiers, financial institutions could
generate more accurate and granular assessments of creditworthiness than
ever before—a boon for both lenders and borrowers.
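
As a rough illustration of the aggregation idea, the sketch below combines
three simple classifiers by majority vote. The rules, thresholds, and
field names are hypothetical stand-ins for trained models; a production
credit-scoring ensemble would learn its components from data.

```python
def income_rule(applicant):
    # Hypothetical rule: approve if annual income is at least 40,000
    return applicant["income"] >= 40_000


def debt_rule(applicant):
    # Hypothetical rule: approve if debt-to-income ratio is at most 0.4
    return applicant["debt_ratio"] <= 0.4


def history_rule(applicant):
    # Hypothetical rule: approve if there are no missed payments
    return applicant["missed_payments"] == 0


def majority_vote(classifiers, applicant):
    """Approve when more than half of the classifiers vote to approve."""
    votes = [classifier(applicant) for classifier in classifiers]
    return sum(votes) > len(votes) / 2


ensemble = [income_rule, debt_rule, history_rule]
applicant = {"income": 50_000, "debt_ratio": 0.5, "missed_payments": 0}
print("Approved" if majority_vote(ensemble, applicant) else "Declined")
```

Because no single rule decides the outcome, an error in one component can
be outvoted by the others, which is the intuition behind the improved
accuracy of ensembles.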

Yet, with all its potential, the adoption of machine learning in finance was
met with challenges. The black box nature of certain algorithms,
particularly those in deep learning, raised concerns about interpretability
and trust. Financial institutions, bound by regulations and the need for
transparency, grappled with balancing the performance of these models
against the requirement to explain their decision-making processes.

Moreover, machine learning models are only as good as the data they are
trained on. Issues such as overfitting, where models perform exceptionally
well on historical data but fail to generalize to unseen data, became a focal
point of attention. Data quality, privacy, and the ethical use of machine
learning also became topics of heated discussion within the financial
community.

1.1.4 Overview of Financial Markets and Instruments

At the heart of financial markets lie equities, representing ownership
shares in public companies. The stock exchanges where these shares are
traded, from the New York Stock Exchange to the Tokyo Stock Exchange,
serve as barometers of economic health, reacting instantaneously to the
pulse of news, earnings reports, and investor sentiment. Equities are just
one component of a much broader ecosystem that includes bonds,
commodities, currencies, derivatives, and more.

The bond market, often seen as the more temperate sibling to the volatile
equities market, deals in fixed-income securities. It is a haven for investors
seeking steady returns, but it also plays a crucial role in the functioning of
the economy by allowing governments and corporations to borrow funds.
Bonds range from the ultra-secure government-issued treasuries to high-
yield junk bonds, each offering a different level of risk and return.

Commodities markets trade in physical goods such as precious metals, oil,
and agricultural products. These markets are of primal economic
importance, and their fluctuations can ripple through to every corner of the
globe, influencing inflation, currency exchange rates, and even geopolitical
dynamics. The pricing of commodities involves a complex interplay of
supply and demand, production costs, and macroeconomic factors.

Currency markets, or the foreign exchange markets, are immense and fluid,
with trillions of dollars exchanged daily. Currencies are traded in pairs,
reflecting the interconnected nature of global trade and finance. Exchange
rates fluctuate continuously, impacted by interest rate differentials,
economic data, and global events. The forex market is a testament to the
interconnectedness of the world's economies, where a policy shift in one
nation can send ripples across the globe.

Derivatives, including futures, options, and swaps, are financial contracts
whose values are derived from underlying assets. They serve various
purposes, from hedging against price movements to speculative ventures.
The derivatives market is complex and powerful, capable of both mitigating
risk and, as history has shown, exacerbating financial crises when used
imprudently.

Each of these markets operates in a web of regulations and technological
infrastructures that ensure liquidity, transparency, and fairness. Modern
trading platforms, powered by advanced algorithms and machine learning
models, allow for the rapid execution of trades and sophisticated analysis of
market conditions. The growing influence of algorithmic trading has
brought about both increased efficiency and new challenges, such as the
potential for flash crashes caused by automated trading errors.

1.1.5 Ethical Considerations and Bias in Financial Modeling

Financial modeling is not a value-neutral science. The models we build
often reflect the values of their creators, whether explicitly or implicitly. As
such, ethical considerations must be at the forefront of model development,
guiding the choices we make—from data selection to algorithmic design.
Ethical modeling respects the principles of fairness, accountability, and
transparency, seeking to mitigate harm while enhancing the common good.

Bias, a deviation from the standard of impartiality, can be insidious,
creeping into models through various channels. Data bias emerges when the
historical data used to train algorithms contains prejudicial elements,
leading to skewed or discriminatory outcomes. Algorithmic bias can occur
when the models themselves process data in ways that reinforce stereotypes
or systemic inequalities. Confirmation bias, the tendency to favor
information that confirms existing beliefs, can cloud the judgment of
analysts, influencing the very premises upon which models are built.

Consider the impact of biased credit scoring models, which might
systematically disadvantage certain demographic groups, or trading
algorithms that inadvertently exacerbate market inequality. Such outcomes
are not merely technical glitches but ethical failings with tangible
consequences for individuals and society.

Addressing these concerns starts with the acknowledgment of the inherent
biases that all data and models carry. It requires the rigorous examination of
data sources, constant validation against fresh, unbiased datasets, and the
willingness to challenge and refine our assumptions. Machine learning
practitioners must be vigilant, ensuring that their models do not perpetuate
or amplify existing biases, but rather work towards neutralizing them.
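
One simple way to begin such an examination is to compare a model's
approval rates across groups. The sketch below is a minimal example under
stated assumptions: the group labels and decisions are made-up data, and
the ratio of lowest to highest approval rate is only one of many possible
fairness checks, not a complete audit.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the approval rate per group.

    decisions is an iterable of (group_label, approved) pairs.
    """
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        approvals[group] += int(approved)
    return {group: approvals[group] / totals[group] for group in totals}


def approval_rate_ratio(rates):
    """Lowest group approval rate divided by the highest (1.0 = parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # No group was approved, so the rates are equal
    return min(rates.values()) / highest


# Hypothetical model decisions for two groups of applicants
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
rates = approval_rates(decisions)
print(rates)
print(f"Approval rate ratio: {approval_rate_ratio(rates):.3f}")
```

A ratio well below 1.0 does not prove discrimination on its own, but it
flags a disparity that deserves a closer look at the data and the model.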

Moreover, models should be transparent and explainable. Stakeholders must
be able to understand how decisions are made, what data informs them, and
the potential limitations at play. Transparency promotes trust and allows for
the scrutiny necessary to identify and correct ethical breaches.

Ethics in financial modeling also extends to privacy concerns. The
aggregation and analysis of vast amounts of personal financial data raise
questions about consent and the proper stewardship of sensitive
information. Data scientists have a duty to safeguard this data, ensuring that
privacy is not sacrificed on the altar of analytical prowess.

The implementation of ethical AI frameworks and adherence to regulatory
guidelines, such as GDPR in Europe, help to formalize the ethical
considerations that must be embedded in financial modeling. These
frameworks encourage accountability, mandating that institutions can
justify the outcomes of their automated decision-making processes.

In the evolving landscape of financial modeling, where machine learning
brings both power and complexity, it is incumbent upon us to wield these
tools responsibly. As we continue to explore the applications of machine
learning in finance, let us do so with a commitment to ethical integrity,
ensuring that the financial models of tomorrow are built not only with
sophistication but also with a deep sense of social responsibility.

As we turn our attention to the following section, we'll explore the key
financial concepts that data scientists must grasp to create models that are
not only powerful and predictive but also equitable and just. Through an
ethically grounded approach to machine learning, we can aspire to a
financial ecosystem that is reflective of our highest ideals and aligned with
a more equitable and prosperous society for all.

1.2 KEY FINANCIAL CONCEPTS FOR DATA SCIENTISTS

The time value of money is an essential cornerstone of financial theory,
underpinning many of the models used in investment and risk
assessment. It reflects the premise that a dollar today is worth more
than a dollar tomorrow due to its potential earning capacity. Data scientists
must not only grasp this concept but be adept at applying it through
discounting future cash flows and understanding the implications for
present value calculations.

Financial statements are the bedrock upon which the edifice of corporate
finance is erected. To analyze a company's performance and potential for
investment, one must unravel the complexities of the balance sheet, income
statement, and cash flow statement. A data scientist skilled in financial
statement analysis can identify trends, assess financial health, and spot
anomalies that may signal errors or even fraud.

Risk and return are inextricably linked in the financial markets. The concept
of risk pertains to the uncertainty of returns and the likelihood of
investment outcomes deviating from expectations. Return is the gain or loss
on an investment over a specified period. Understanding the trade-offs
between risk and potential returns is vital for creating robust financial
models that can withstand the caprices of the markets.

Basic portfolio theory, pioneered by Harry Markowitz, posits that
diversification can reduce the risk of a portfolio without diminishing
expected returns. The theory suggests that by combining assets with varying
risk profiles, one can craft a portfolio that minimizes overall volatility. Data
scientists must comprehend the mechanics of correlation and the
quantification of risk to effectively apply machine learning to portfolio
optimization.
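
The effect of correlation on portfolio risk can be seen in the standard
two-asset volatility formula. The sketch below is a minimal illustration;
the function name, weights, volatilities, and correlations are assumed
values chosen for the example.

```python
import math


def two_asset_volatility(weight_a, vol_a, vol_b, correlation):
    """Volatility of a portfolio holding asset A and asset B.

    weight_a is the fraction invested in A; the remainder goes to B.
    vol_a and vol_b are the assets' standard deviations of return.
    """
    weight_b = 1.0 - weight_a
    variance = (
        (weight_a * vol_a) ** 2
        + (weight_b * vol_b) ** 2
        + 2 * weight_a * weight_b * correlation * vol_a * vol_b
    )
    # Guard against tiny negative values caused by floating-point rounding
    return math.sqrt(max(variance, 0.0))


# Two hypothetical assets, each with 20% volatility, held in equal weights
for correlation in (1.0, 0.5, 0.0, -0.5):
    vol = two_asset_volatility(0.5, 0.20, 0.20, correlation)
    print(f"correlation {correlation:+.1f} -> portfolio volatility {vol:.2%}")
```

If the two assets also have the same expected return, the portfolio's
expected return is unchanged by the weights, so any drop in volatility as
correlation falls is a pure diversification benefit.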

Behavioral finance adds a layer of psychological complexity to the
landscape, challenging the traditional assumption that markets are rational.
Insights from behavioral finance reveal that cognitive biases and emotional
responses can significantly influence investor behavior. Integrating these
insights into machine learning models can enhance their predictive capacity,
enabling a more nuanced understanding of market dynamics.

Grounding machine learning in these fundamental financial concepts
provides a sturdy platform from which to launch more sophisticated
analytical endeavors. The mastery of these principles equips data scientists
with the necessary tools to craft models that are not only technically
proficient but also deeply attuned to the financial domain's unique rhythms
and nuances.

As we venture forth into the statistical foundations that underpin predictive
modeling, let us carry with us the knowledge that finance is as much an art
as it is a science. A harmonious blend of quantitative rigor and qualitative
insight is the hallmark of any seasoned financial analyst or data scientist.
Through the thoughtful integration of key financial concepts, we pave the
way for machine learning models that can illuminate the shadows of
uncertainty and guide decision-making in the complex dance of financial
markets.

1.2.1 Time value of money principles

The time value of money (TVM) rests on the axiom that money available now is more valuable
than the same amount in the future due to its potential earning capacity.
This principle is the bedrock upon which the empire of compound interest
is built. It's the concept that informs investors when they assess the viability
of pouring funds into a new venture or when a family decides to save for
their child's education.

To elucidate the time value of money, consider a simple Python code
snippet that computes the future value of a single sum:

```python
def future_value(present_value, annual_rate, periods_per_year, years):
    """Return the future value of a single sum under periodic compounding."""
    rate_per_period = annual_rate / periods_per_year
    periods = periods_per_year * years
    # Compound the present value over every period
    return present_value * (1 + rate_per_period) ** periods


# Example usage:
present_value = 1000   # Present value in dollars
annual_rate = 0.05     # Annual interest rate as a decimal
periods_per_year = 12  # Monthly compounding
years = 5              # Number of years to calculate

fv = future_value(present_value, annual_rate, periods_per_year, years)
print(f"The future value of the investment is: ${fv:.2f}")
```

Using such code, a data scientist can swiftly calculate the future worth of
present-day investments. Knowing this allows for sound economic
planning, whether it be in personal finance, corporate investment strategies,
or government fiscal policies.

Discounted cash flow (DCF) analysis, a technique that applies TVM to
assess investment opportunities, is a potent tool in a financial analyst’s
armory. It enables analysts to determine the present value of expected future
cash flows, factoring in a discount rate that encapsulates the risk and
opportunity cost of tying up capital.

Let's illustrate with an example in Excel, often the data scientist's
companion in financial analysis. Imagine you're evaluating a series of cash
flows expected from a project over the next five years. Using Excel's NPV
function, =NPV(discount_rate, range_of_cash_flows), you can bring future
dollars into today's terms, laying bare the project's true value. Note that
NPV treats the first value in the range as arriving at the end of the first
period, so an initial outlay made today must be added outside the function.
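The same calculation can be sketched in Python. This is a minimal
illustration that assumes the Excel convention (the first cash flow arrives
one period from now); the function name `npv`, the cash-flow figures, the 8%
discount rate, and the initial outlay are hypothetical values chosen for the
example.

```python
def npv(discount_rate, cash_flows):
    """Present value of cash flows; the first flow is one period from now."""
    return sum(cf / (1 + discount_rate) ** period
               for period, cf in enumerate(cash_flows, start=1))

# Hypothetical project: five annual inflows discounted at 8%
cash_flows = [300, 300, 300, 300, 300]
discount_rate = 0.08
present_value = npv(discount_rate, cash_flows)

# An outlay made today is not discounted, so it is subtracted separately
initial_outlay = 1000
net_value = present_value - initial_outlay

print(f"Present value of inflows: ${present_value:.2f}")
print(f"Net present value: ${net_value:.2f}")
```

A positive net value suggests the discounted inflows exceed the outlay at
the chosen discount rate; the conclusion is only as good as that rate.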

TVM also reaches into the domain of annuities and perpetuities, concepts
that shape retirement planning and the pricing of financial instruments like
bonds. The ability to calculate the present or future value of these financial
streams is a quintessential skill for data scientists working with financial
models.
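The standard closed-form expressions for these streams can be sketched as
follows. The function names and example figures are illustrative
assumptions; the formulas are for an ordinary annuity (payments at the end
of each period) and a level perpetuity, and they require a non-zero rate.

```python
def annuity_present_value(payment, rate, periods):
    """PV of an ordinary annuity: level payments at the end of each period."""
    return payment * (1 - (1 + rate) ** -periods) / rate

def annuity_future_value(payment, rate, periods):
    """FV of an ordinary annuity at the date of the final payment."""
    return payment * ((1 + rate) ** periods - 1) / rate

def perpetuity_present_value(payment, rate):
    """PV of a level payment received forever, starting one period from now."""
    return payment / rate

# Hypothetical example: $1,000 per year at a 5% annual rate
payment, rate, periods = 1000, 0.05, 20
pv_annuity = annuity_present_value(payment, rate, periods)
fv_annuity = annuity_future_value(payment, rate, periods)
pv_perpetuity = perpetuity_present_value(payment, rate)

print(f"PV of 20-year annuity: ${pv_annuity:,.2f}")
print(f"FV of 20-year annuity: ${fv_annuity:,.2f}")
print(f"PV of perpetuity:      ${pv_perpetuity:,.2f}")
```

The annuity's present value always sits below the perpetuity's, and
approaches it as the number of periods grows.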

Mastering the time value of money principles unlocks a deeper
understanding of interest rates, inflation, and the psychology of investing.
It's a concept that permeates the financial fabric of societies, echoing in the
corridors of banks, investment firms, and universities.

As we dive further into the nuances of finance through the lens of machine
learning, we carry the time value of money with us. It is a fundamental truth
that resonates through all subsequent concepts, a thread that weaves through
the narrative of finance with unwavering constancy. It empowers data
scientists to build predictive models that are not just reflections of data
patterns but are also imbued with the time-honored wisdom of financial
theory.

1.2.2 Financial statement analysis for data scientists

Financial statements are the cornerstone documents that encapsulate a
company's fiscal health and operational efficiency. They consist of the
balance sheet, income statement, and cash flow statement, each serving as a
window into various aspects of the company’s financial state.

The balance sheet is akin to a snapshot, providing a momentary glimpse of
a company's assets, liabilities, and shareholders' equity. It is a reflection of
what the company owns and owes, a ledger of its financial standing at a
point in time. For the data scientist, the balance sheet is a treasure trove,
ripe for analysis and predictive modeling.

An income statement, meanwhile, flows like the narrative of a novel,
detailing the company’s revenues, expenses, and profits over a period. It
tells the unfolding story of a company's ability to generate earnings as it
operates. Data scientists can dive into this narrative, employing machine
learning algorithms to discern patterns and predict future performance.

The cash flow statement narrates the tale of liquidity, charting the inflows
and outflows of cash. It is the lifeblood of an organization, revealing how
well it manages its cash to fund operations, pay debts, and make
investments. Analyzing cash flows through statistical models enables data
scientists to forecast a company's ability to sustain operations and grow.

To illustrate, let's consider a practical example in which a data scientist
utilizes Python to analyze a company's financial ratios, quantitative
indicators derived from financial statement data:

```python
import pandas as pd

# Assume we have a DataFrame 'financials' with financial statement data
financials = pd.DataFrame({
    'Total Assets': [1000000],
    'Total Liabilities': [500000],
    'Shareholders Equity': [500000],
    'Net Income': [150000],
    'Revenue': [500000],
    'Operating Cash Flow': [200000]
})

# Calculate some key financial ratios.
# Total assets over total liabilities is a solvency measure; a true current
# ratio needs current assets and current liabilities, which this data lacks.
financials['Assets to Liabilities Ratio'] = (
    financials['Total Assets'] / financials['Total Liabilities'])
financials['Debt to Equity Ratio'] = (
    financials['Total Liabilities'] / financials['Shareholders Equity'])
financials['Return on Equity'] = (
    financials['Net Income'] / financials['Shareholders Equity'])
financials['Profit Margin'] = financials['Net Income'] / financials['Revenue']
financials['Operating Cash Flow to Revenue'] = (
    financials['Operating Cash Flow'] / financials['Revenue'])

print(financials)
```

By employing such analyses, a data scientist can identify the financial
strengths and weaknesses of an enterprise, discern trends over time, and
predict future solvency and profitability.

In Excel, financial statement analysis might revolve around constructing
formulas to compute these ratios across historical data. A data scientist with
a command of Excel features such as VLOOKUP, PivotTables, and
conditional formatting can produce comprehensive dashboards that provide
a visual representation of a company's fiscal health.

In the vast sea of data, the financial statements serve as the lighthouse,
guiding data scientists toward informed conclusions and impactful insights.
By harnessing the power of machine learning and computational tools, the
modern data scientist can elevate the time-tested practices of financial
analysis to new heights, revealing patterns unseen by the traditional
analyst's eye.

The analysis of financial statements is not simply about number-crunching;
it is about telling the story of a company's past and predicting the narrative
of its future. As we continue to explore the confluence of machine learning
and finance, the role of financial statement analysis stands as a testament to
the power of data in shaping the financial strategies of tomorrow.

1.2.3 Risk and Return Fundamentals

In the financial universe, risk and return are the yin and yang, the
fundamental forces shaping the investment landscape. Understanding this
dynamic duo is paramount for data scientists weaving predictive tapestries
in the world of finance. They are two sides of the same coin, the essence of
every investment decision, and the core of financial strategy.

The concept of risk in finance refers to the probability that an investment's
actual return will differ from the expected return, and encompasses the
potential for both upside and downside fluctuations. Return, on the other
hand, is the reward for bearing risk: the greater the risk, the higher the
expected return should be. This principle is the gravitational pull that keeps
the orbits of financial instruments in check.

To delve deeper, let's demystify the concept of variance, a statistical
measure of dispersion that quantifies risk in the world of investments:

```python
import numpy as np

# Suppose we have an array of historical returns for a particular stock
stock_returns = np.array([0.12, 0.08, 0.06, -0.02, 0.07])

# Calculate the average return (mean)
average_return = np.mean(stock_returns)

# Calculate the variance of returns. The history is a sample, so ddof=1
# gives the sample variance; NumPy's default (ddof=0) is the population form.
variance = np.var(stock_returns, ddof=1)

print(f"Average Return: {average_return}")
print(f"Variance: {variance}")
```

Variance captures the essence of risk; it tells us how much the returns of a
stock are spread out around the mean. A high variance indicates a high level
of risk, as the investment's returns are more unpredictable.

In the context of portfolio management, data scientists leverage the concept
of diversification, a risk management technique that mixes a wide variety
of investments within a portfolio. The rationale is straightforward: different
asset classes often move in dissimilar directions, so when one asset
experiences volatility, others may remain stable or even rise, thus reducing
the overall risk.

A practical Excel exercise could involve calculating the correlation between
different assets to inform diversification strategies. Correlation measures
the degree to which two securities move in relation to each other:

```excel
=CORREL(array1, array2)
```

A correlation close to 1 implies that the assets move in the same direction,
while a correlation close to -1 indicates that they move in opposite
directions.
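The same measurement can be sketched in Python with NumPy's `corrcoef`.
The three return series below are hypothetical figures made up for the
illustration; `asset_c` is constructed as the exact mirror of `asset_a` to
show the -1 case.

```python
import numpy as np

# Hypothetical periodic returns for three assets
asset_a = np.array([0.02, -0.01, 0.03, 0.01, -0.02, 0.04])
asset_b = np.array([0.01, -0.02, 0.02, 0.00, -0.01, 0.03])
asset_c = -asset_a  # built to move exactly opposite to asset_a

# np.corrcoef returns a correlation matrix; [0, 1] is the pairwise value
corr_ab = np.corrcoef(asset_a, asset_b)[0, 1]
corr_ac = np.corrcoef(asset_a, asset_c)[0, 1]

print(f"Correlation A-B: {corr_ab:.3f}")
print(f"Correlation A-C: {corr_ac:.3f}")
```

Pairs with low or negative correlation are the natural candidates when
diversification is the goal.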

But risk is not a monolith; it is a mosaic, with varying types that data
scientists must dissect. Credit risk pertains to a borrower's potential default
on a loan, market risk arises from fluctuations in market prices, and
liquidity risk involves the inability to execute a transaction at the prevailing
market price.

In the same breath, the expected return of an investment is not a simple
arithmetic mean. It is a probabilistic expectation based on various factors,
including the risk-free rate (the return of an investment with no risk of
financial loss), the beta (a measure of an asset's volatility in relation to the
overall market), and the equity risk premium (the extra return above the
risk-free rate demanded by investors for taking on the additional risk
associated with equities).

One could envision a Python script that employs the Capital Asset Pricing
Model (CAPM) to calculate expected returns:

```python
def calculate_expected_return(risk_free_rate, beta, market_return):
    # CAPM: risk-free rate plus beta times the market risk premium
    return risk_free_rate + beta * (market_return - risk_free_rate)

# Example values
risk_free_rate = 0.03  # 3%
beta = 1.2
market_return = 0.10   # 10%

expected_return = calculate_expected_return(risk_free_rate, beta,
                                            market_return)

print(f"Expected Return: {expected_return}")
```

This code snippet provides a framework for understanding how different
factors influence the expected return on an asset.

The saga of risk and return is age-old, but data science breathes new life
into it. Through the power of algorithms, machine learning, and robust
statistical tools, today's data scientists are equipped to analyze and model
risk and return with unprecedented precision. Yet, as they draw insights
from vast datasets and construct sophisticated models, they must not lose
sight of the human element—the investors whose fortunes and futures are
influenced by these very models.

Risk and return are not only mathematical constructs; they are the blood
and bones of financial decision-making. Understanding these principles is
not a mere intellectual exercise—it is a foundational imperative for any data
scientist aspiring to make a mark in the financial domain.

1.2.4 Basic Portfolio Theory

At its most elemental level, Basic Portfolio Theory is the study of how
investors can construct portfolios to optimize or maximize expected return
based on a given level of market risk, emphasizing that risk is an inherent
part of higher reward. It is the veritable backbone of strategic asset
allocation and provides a systematic approach to the decision-making
process for investment portfolios.

Originating from Harry Markowitz's pioneering work in 1952, the theory
introduces the concept of diversification to reduce unsystematic risk. By
selecting a variety of asset classes that correlate imperfectly with one
another, investors can construct a portfolio that offers the potential for
lower volatility and more stable returns.

To apply these ideas, a data scientist might employ Python to simulate
different portfolio combinations and evaluate their potential risks and
returns. Here’s a simplified example using Monte Carlo simulation to
visualize possible risk-return profiles of various portfolio combinations:

```python
import numpy as np
import matplotlib.pyplot as plt

# Let's assume we have two assets with their expected returns and standard
# deviations
asset1_return, asset1_std = 0.10, 0.15
asset2_return, asset2_std = 0.12, 0.20

# Correlation coefficient between the assets
rho = 0.5

# Number of portfolios to simulate
num_portfolios = 10000

# Lists to store the simulated portfolio returns and risks
port_returns = []
port_risks = []

for _ in range(num_portfolios):
    # Randomly assign weights to the assets for each portfolio
    weights = np.random.random(2)
    weights /= np.sum(weights)

    # Expected portfolio return
    port_return = weights[0] * asset1_return + weights[1] * asset2_return
    port_returns.append(port_return)

    # Portfolio risk (standard deviation)
    port_risk = np.sqrt((weights[0] * asset1_std) ** 2
                        + (weights[1] * asset2_std) ** 2
                        + 2 * weights[0] * weights[1]
                        * asset1_std * asset2_std * rho)
    port_risks.append(port_risk)

# Convert lists to arrays
port_returns = np.array(port_returns)
port_risks = np.array(port_risks)

# Plot the risk-return profiles of the simulated portfolios. The color is
# return divided by risk, i.e. the Sharpe ratio with a zero risk-free rate.
plt.scatter(port_risks, port_returns, c=port_returns/port_risks, marker='o')
plt.title('Portfolio Optimization Simulation')
plt.xlabel('Portfolio Risk (Standard Deviation)')
plt.ylabel('Portfolio Return')
plt.colorbar(label='Sharpe Ratio')
plt.show()
```

This visualization aids in identifying the 'efficient frontier', the set of
optimal portfolios that offer the highest expected return for a defined level
of risk or the lowest risk for a given level of expected return.

A practical approach using Microsoft Excel might involve the use of the
`SOLVER` tool to optimize the weight distribution of assets in a portfolio to
maximize the Sharpe ratio, which is a measure of risk-adjusted return. The
SOLVER can adjust the allocation weights to find the portfolio with the
highest Sharpe ratio, subject to the constraints of the weights summing up
to 1 and any other investor-imposed constraints.
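The optimization described above can also be sketched in Python with
`scipy.optimize.minimize`. This is an illustrative stand-in for the Solver
workflow, not a reproduction of it: the two assets reuse the figures from the
simulation earlier in this section, while the 3% risk-free rate and the
long-only bounds are assumptions added for the example.

```python
import numpy as np
from scipy.optimize import minimize

expected_returns = np.array([0.10, 0.12])
std_devs = np.array([0.15, 0.20])
rho = 0.5
risk_free_rate = 0.03  # assumed for illustration

# Covariance matrix built from the standard deviations and correlation
off_diagonal = rho * std_devs[0] * std_devs[1]
cov = np.array([[std_devs[0] ** 2, off_diagonal],
                [off_diagonal, std_devs[1] ** 2]])

def negative_sharpe(weights):
    """Negative Sharpe ratio, so that minimizing it maximizes the ratio."""
    port_return = weights @ expected_returns
    port_risk = np.sqrt(weights @ cov @ weights)
    return -(port_return - risk_free_rate) / port_risk

result = minimize(
    negative_sharpe,
    x0=np.array([0.5, 0.5]),
    method="SLSQP",
    bounds=[(0.0, 1.0), (0.0, 1.0)],  # long-only weights (assumption)
    constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
)

optimal_weights = result.x
max_sharpe = -result.fun
print(f"Optimal weights: {optimal_weights}")
print(f"Maximum Sharpe ratio: {max_sharpe:.4f}")
```

Additional investor-imposed constraints would be appended to the
`constraints` list in the same dictionary form.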

Basic Portfolio Theory also introduces the distinction between systematic
and unsystematic risk. Systematic risk, or market risk, affects all
investments and is considered un-diversifiable. Unsystematic risk, or
specific risk, is unique to a particular company or industry and can be
mitigated through diversification.
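A small numerical sketch can make the distinction concrete. It assumes a
stylized market in which every asset has the same volatility and every pair
has the same correlation (both figures are hypothetical), and it uses an
equally weighted portfolio. Under those assumptions the portfolio's risk
falls as assets are added but levels off at a floor that diversification
cannot remove.

```python
import numpy as np

def equal_weight_portfolio_std(n_assets, asset_std, pairwise_corr):
    """Standard deviation of an equally weighted portfolio of similar assets."""
    cov = np.full((n_assets, n_assets), pairwise_corr * asset_std ** 2)
    np.fill_diagonal(cov, asset_std ** 2)
    weights = np.full(n_assets, 1.0 / n_assets)
    return float(np.sqrt(weights @ cov @ weights))

asset_std = 0.20      # hypothetical volatility of each asset
pairwise_corr = 0.30  # hypothetical common correlation

for n in (1, 2, 5, 20, 100):
    risk = equal_weight_portfolio_std(n, asset_std, pairwise_corr)
    print(f"{n:>3} assets: portfolio std = {risk:.4f}")

# Limit as n grows: only the shared (systematic) component remains
systematic_floor = asset_std * np.sqrt(pairwise_corr)
print(f"Systematic floor: {systematic_floor:.4f}")
```

The gap between the single-asset risk and the floor is the unsystematic
part; the floor itself stands in for systematic risk in this toy setting.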

The application of machine learning in portfolio theory could extend to
pattern recognition in historical data to forecast future returns and
covariances, facilitate algorithmic rebalancing, and even tailor personalized
investment strategies to individual risk preferences.

1.2.5 Behavioral Finance Insights for Modelers

The essence of behavioral finance is the recognition that investors are not
always rational, markets are not always efficient, and that cognitive biases
and emotions can significantly influence investment decisions. This
deviation from the expected utility maximization and the efficient market
hypothesis presents a fertile ground for data scientists and modelers to
explore new dimensions of financial analysis.

For instance, one might explore the phenomenon of overconfidence, where
investors overestimate the precision of their knowledge or predictions. A
Python-based machine learning model could analyze historical trading data
to identify patterns that suggest overconfidence, such as excessive trading
volume or insufficient diversification.

Here’s an example of how one might implement a basic analysis of trading
behavior in Python:

```python
import pandas as pd

# Assuming we have a DataFrame `trades` with columns for 'investor_id',
# 'trade_volume', and 'diversification'
trades = pd.read_csv('investor_trading_data.csv')

# Calculate average trade volume and diversification index for each investor
investor_stats = trades.groupby('investor_id').agg(
    {'trade_volume': 'mean', 'diversification': 'mean'})

# Define thresholds for overconfidence indicators
high_volume_threshold = investor_stats['trade_volume'].quantile(0.9)
low_diversification_threshold = investor_stats['diversification'].quantile(0.1)

# Identify potentially overconfident investors
overconfident_investors = investor_stats[
    (investor_stats['trade_volume'] > high_volume_threshold)
    & (investor_stats['diversification'] < low_diversification_threshold)
]

print(overconfident_investors)
```

This rudimentary analysis could serve as a starting point for further
investigation into the behavioral tendencies of investors.

In the realm of Microsoft Excel, a financial modeler could conduct similar
analyses using Excel functions and pivot tables to sort and filter data,
creating descriptive statistics that highlight potential behavioral biases
within a dataset.

Behavioral finance also considers heuristics: simple, efficient rules (either
learned or hard-wired into the brain) that have evolved to make
decisions quicker and easier. These mental shortcuts, however, can lead to
systematic deviations from logic, probability, or rational choice theory.

For the data scientist, incorporating behavioral finance insights into
financial modeling means acknowledging and accounting for these biases.
Machine learning algorithms, particularly classification and clustering
algorithms, can be employed to segment investors according to their
behavioral patterns, such as herding, anchoring, or aversion to loss.

The inclusion of sentiment analysis, using techniques such as natural
language processing (NLP) on financial news, blogs, or social media, can
further enrich financial models. By quantifying the mood or subjective tone
of market discourse, data scientists can integrate a more nuanced picture of
market dynamics.

Consider an example where we utilize a sentiment analysis library in
Python to evaluate the sentiment of financial news headlines:

```python
from textblob import TextBlob
import pandas as pd

# Load financial news headlines
news_headlines = pd.read_csv('financial_news_headlines.csv')

# Calculate sentiment polarity for each headline
news_headlines['sentiment'] = news_headlines['headline'].apply(
    lambda x: TextBlob(x).sentiment.polarity)

# Explore the distribution of sentiment polarity
print(news_headlines['sentiment'].describe())
```

Models that incorporate these insights can potentially provide an edge by
anticipating market movements driven by investor psychology, rather than
solely by fundamental indicators.

1.3 STATISTICAL FOUNDATIONS

Beneath the pulsating heart of modern finance lies an intricate vascular
system of statistical theories and practices. This foundational bedrock
of quantitative analysis permits the systematic dissection of financial
data, offering profound insights into the probabilistic machinations of
markets and the behavior of investment returns.

In this section, we dive into the statistical underpinnings that sustain
machine learning models, beginning with the core principles of statistical
inference. This involves the process of drawing conclusions about
populations or scientific truths from data. To illustrate, consider a financial
analyst who seeks to infer the average annual return of a market index. By
employing Python's statistical libraries, the analyst could execute a script
akin to the following:

```python
import numpy as np
import scipy.stats as stats

# Assuming 'annual_returns' is a NumPy array of yearly returns for a
# market index (the [...] is a placeholder for the actual data)
annual_returns = np.array([...])

# Calculate the sample mean and standard deviation
sample_mean = np.mean(annual_returns)
sample_std = np.std(annual_returns, ddof=1)

# Construct a 95% confidence interval for the mean annual return,
# using the t-distribution with n - 1 degrees of freedom
confidence_interval = stats.t.interval(0.95, len(annual_returns) - 1,
                                       loc=sample_mean,
                                       scale=stats.sem(annual_returns))

print("The 95% confidence interval for the mean annual return is: "
      f"{confidence_interval}")
```

This code snippet exemplifies how statistical inference can provide a range
within which the true mean annual return likely lies, equipping investors
with a more informed perspective.

Another cornerstone of statistical foundations is probability distribution
analysis. In finance, different types of distributions are used to model the
behavior of asset returns. For example, while stock returns are often
assumed to follow a normal distribution, this assumption can be naive.
Heavy tails and skewness are common in financial data, and models such as
the Student's t-distribution can more accurately reflect the heavy tails.
In practice, financial modelers would perform tests for normality and adapt
their models accordingly, as demonstrated here:

```python
import scipy.stats as stats

# Test for normality of returns
k2, p = stats.normaltest(annual_returns)
alpha = 1e-3

print(f"p = {p}")

# Null hypothesis: the sample comes from a normal distribution
if p < alpha:
    print("The null hypothesis can be rejected. "
          "The data may not be normally distributed.")
else:
    print("The null hypothesis cannot be rejected. "
          "The data may be normally distributed.")
```

Furthermore, the exploration of time series analysis is indispensable in
finance. Financial variables, such as stock prices and interest rates, are often
serially correlated and exhibit volatility clustering. Techniques like
Autoregressive Integrated Moving Average (ARIMA) models or
Generalized Autoregressive Conditional Heteroskedasticity (GARCH)
models are tailored to capture these temporal dependencies and volatilities
in time series data.
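Volatility clustering can be made visible with a short NumPy sketch. It
simulates a GARCH(1,1) process rather than fitting one, and the parameter
values (`omega`, `alpha`, `beta`), the seed, and the helper names are all
assumptions chosen for illustration. The point is that returns themselves
show little serial correlation while their squares do.

```python
import numpy as np

def simulate_garch(n, omega, alpha, beta, rng):
    """Simulate GARCH(1,1) returns and their conditional variances."""
    returns = np.zeros(n)
    variance = np.zeros(n)
    variance[0] = omega / (1 - alpha - beta)  # long-run (unconditional) variance
    returns[0] = np.sqrt(variance[0]) * rng.standard_normal()
    for step in range(1, n):
        variance[step] = (omega + alpha * returns[step - 1] ** 2
                          + beta * variance[step - 1])
        returns[step] = np.sqrt(variance[step]) * rng.standard_normal()
    return returns, variance

def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    centered = x - x.mean()
    return float(np.dot(centered[:-lag], centered[lag:])
                 / np.dot(centered, centered))

rng = np.random.default_rng(42)
returns, variance = simulate_garch(5000, omega=1e-5, alpha=0.10, beta=0.85,
                                   rng=rng)

acf_returns = autocorrelation(returns, lag=1)
acf_squared = autocorrelation(returns ** 2, lag=1)
print(f"Lag-1 autocorrelation of returns:         {acf_returns:.3f}")
print(f"Lag-1 autocorrelation of squared returns: {acf_squared:.3f}")
```

Fitting such a model to real data would normally be done with a dedicated
time series library rather than by hand.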

As we consider regression analysis and hypothesis testing, we face the
challenge of discerning relationships between variables. Are certain
financial metrics predictive of stock performance? Does the introduction of
a new policy affect market volatility? To answer such questions, regression
models are employed, and hypothesis tests are conducted to determine the
statistical significance of the results.
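A minimal sketch of this workflow, using `scipy.stats.linregress` on
simulated data, is shown below. The data-generating process (a hypothetical
standardized metric with a true slope of 0.02 plus noise), the seed, and the
5% significance level are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_obs = 200

# Simulated data: a hypothetical standardized financial metric and returns
metric = rng.normal(0.0, 1.0, n_obs)
noise = rng.normal(0.0, 0.03, n_obs)
stock_return = 0.01 + 0.02 * metric + noise

# Ordinary least squares fit of returns on the metric
result = stats.linregress(metric, stock_return)

# Null hypothesis: the slope is zero (the metric has no linear relationship)
significance_level = 0.05
is_significant = result.pvalue < significance_level

print(f"Slope: {result.slope:.4f}, intercept: {result.intercept:.4f}")
print(f"R-squared: {result.rvalue ** 2:.3f}, p-value: {result.pvalue:.2e}")
print(f"Reject the null at the 5% level: {is_significant}")
```

Statistical significance on historical data does not by itself establish
that a metric will remain predictive out of sample.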

Lastly, the concepts of overfitting and underfitting are addressed.
Overfitting occurs when a model learns not only the underlying signal in
the training data but also its noise, leading to poor generalization to unseen
data. Underfitting, conversely, happens when a model is too simplistic to
capture the underlying structure. Both conditions are detrimental to model
performance, and techniques such as cross-validation, regularization, and
model selection criteria are crucial to prevent them.
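The contrast can be sketched with polynomial fits of increasing degree on a
small noisy dataset, scored on held-out data. The signal (a sine curve), the
noise level, the seed, and the chosen degrees are assumptions for
illustration. Training error can only fall as the degree rises; the
validation error is what reveals whether the extra flexibility helps.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)

# Hypothetical noisy observations of a smooth underlying signal
x = rng.uniform(-1.0, 1.0, 60)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, 60)

# Simple hold-out split: first half for training, second half for validation
x_train, y_train = x[:30], y[:30]
x_val, y_val = x[30:], y[30:]

def mse(model, x_values, y_values):
    """Mean squared error of a fitted polynomial on the given data."""
    return float(np.mean((model(x_values) - y_values) ** 2))

results = {}
for degree in (1, 3, 12):
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train),
                       mse(model, x_val, y_val))
    print(f"degree {degree:>2}: train MSE = {results[degree][0]:.4f}, "
          f"validation MSE = {results[degree][1]:.4f}")
```

A degree that is too low tends to leave both errors high (underfitting),
while a degree that is too high tends to drive training error down without
a matching improvement on the validation set (overfitting).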

1.3.1 Probabilistic frameworks and inference

Probability theory, the mathematical backbone of probabilistic frameworks,
serves as the fundamental building block for machine learning algorithms
applied in finance. It endows models with the ability to manage uncertainty
and make informed predictions. Through the lens of finance, these
frameworks are employed to assess risks, to forecast market trends, and to
estimate probabilities of financial events, such as defaults or stock price
movements.

Inference, a critical component of probabilistic frameworks, is the means by
which data scientists derive broader implications from data samples. The
leap from observed data to general conclusions necessitates the careful
construction of probabilistic models that can account for randomness and
uncertainty. For instance, Bayesian inference provides a structured way of
updating beliefs in the presence of new evidence. The Bayesian approach
incorporates prior beliefs about a model's parameters and updates these
beliefs as new data becomes available.
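The updating step can be shown in its simplest form with a conjugate
Beta-Binomial model, where the posterior has a closed form and no sampling
is needed. The scenario (estimating the probability that a stock closes up
on a given day), the Beta(2, 2) prior, and the observed counts are all
hypothetical.

```python
from scipy import stats

# Prior belief about the probability of an "up" day: Beta(2, 2),
# centered on 0.5 and only weakly informative (assumption)
prior_a, prior_b = 2, 2

# Hypothetical new evidence: 60 up days and 40 down days
up_days, down_days = 60, 40

# Conjugate update: add the observed counts to the prior parameters
post_a = prior_a + up_days
post_b = prior_b + down_days

prior_mean = prior_a / (prior_a + prior_b)
posterior_mean = post_a / (post_a + post_b)

# 95% equal-tailed credible interval from the posterior Beta distribution
credible_interval = stats.beta.interval(0.95, post_a, post_b)

print(f"Prior mean:     {prior_mean:.3f}")
print(f"Posterior mean: {posterior_mean:.3f}")
print(f"95% credible interval: {credible_interval}")
```

The posterior mean lies between the prior mean and the observed frequency,
and moves closer to the data as more observations accumulate.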

Consider an example where a data scientist aims to infer the probability
distribution of asset returns based on historical data. They might select a
Bayesian approach to incorporate both the data and any prior beliefs about
market conditions. A Python code snippet for this process could resemble
the following:

```python
import numpy as np
import pymc3 as pm

# Historical return data for an asset
# (the [...] is a placeholder for the actual data)
historical_returns = np.array([...])

# Constructing a Bayesian model
with pm.Model() as model:
    # Prior distribution for the unknown mean return
    mu = pm.Normal('mu', mu=0, sd=1)

    # Prior distribution for the unknown standard deviation
    sigma = pm.HalfNormal('sigma', sd=1)

    # Likelihood (sampling distribution) of observations
    returns = pm.Normal('returns', mu=mu, sd=sigma,
                        observed=historical_returns)

    # Posterior distribution sampling
    trace = pm.sample(1000)

pm.plot_posterior(trace)
```

In the above snippet, PyMC3 is used to define a model with a normal prior
for the mean and a half-normal prior for the standard deviation of asset
returns. The 'pm.Normal' and 'pm.HalfNormal' calls specify the prior beliefs
about these parameters. Then, the observed data is incorporated into the
model through the 'returns' variable. Finally, the 'pm.sample' function
draws samples from the posterior distribution, which reflects the updated
beliefs after considering the data. The 'pm.plot_posterior' function
visualizes the resulting distributions, providing insights into the
estimated parameters.

Inference also extends to frequentist methods, where tools such as
confidence intervals and hypothesis tests are employed without reliance on
prior distributions. These methods provide another avenue for drawing
conclusions from financial data, often through the construction of p-values
and test statistics to refute or support hypotheses about market behavior.

The application of probabilistic frameworks in finance is further
exemplified by the use of Monte Carlo simulations. These simulations rely
on the power of randomness to forecast outcomes under a variety of
scenarios. For instance, in assessing the risk of a portfolio, a Monte Carlo
simulation might generate thousands of potential future asset prices and
calculate the resulting portfolio values, thereby estimating the distribution
of potential outcomes and the probability of incurring losses.
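A minimal sketch of such a simulation follows. It assumes the portfolio
value follows geometric Brownian motion over a one-year horizon, which is a
common simplification rather than a claim about real markets; the starting
value, expected return, volatility, and seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

initial_value = 1_000_000  # hypothetical portfolio value in dollars
expected_return = 0.08     # assumed annual drift
volatility = 0.15          # assumed annual volatility
horizon_years = 1.0
num_simulations = 100_000

# Simulate terminal log returns under geometric Brownian motion
shocks = rng.standard_normal(num_simulations)
log_returns = ((expected_return - 0.5 * volatility ** 2) * horizon_years
               + volatility * np.sqrt(horizon_years) * shocks)
final_values = initial_value * np.exp(log_returns)

# Probability of ending the year below the starting value
prob_loss = float(np.mean(final_values < initial_value))

# 95% value at risk: the loss exceeded in only 5% of simulated scenarios
var_95 = float(initial_value - np.percentile(final_values, 5))

print(f"Estimated probability of a loss: {prob_loss:.1%}")
print(f"Estimated 95% one-year VaR: ${var_95:,.0f}")
```

Heavier-tailed shocks or correlated multi-asset paths can be substituted
for the normal draws without changing the overall structure.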

As we transition from probabilistic frameworks to the more detailed
concepts of distributions and time series analysis, it is essential to carry
forward the understanding that these frameworks are not just mathematical
abstractions. They are, in fact, the very tools that enable financial
professionals to navigate the uncertain waters of the market with poise and
rigor. Through the application of these frameworks, the veil of uncertainty
is lifted, revealing a clearer picture of the financial future, much like the
clearing of clouds over Vancouver's skyline after a rainstorm, promising
new possibilities and insights.

Distributions and their importance in finance

In the domain of machine learning for finance, understanding the nature and
implications of different distributions is paramount. It is these distributions
that describe the range of possible values that a random variable can assume
and the probability of each value occurring. This information is crucial
when modeling financial phenomena, such as asset returns, interest rates, or
currency exchange rates.

To appreciate the central role of distributions in finance, consider the
normal distribution, often referred to as the Gaussian distribution. Its
bell-shaped curve is ubiquitous, representing the idealized distribution of
returns for many financial instruments under normal market conditions. Yet
finance professionals are keenly aware that "fat tails" (the occurrence of
extreme events with higher than expected probability) are a common feature
of financial return distributions. This understanding has profound
implications for risk management and has led to the exploration of
alternative distributions, such as the Student's t-distribution, which
accommodates these heavy tails.

With Python, financial analysts can visualize and model these distributions
to gain insights into the nature of financial data. Below is a Python code
snippet illustrating how to model asset returns using the normal and
Student's t-distributions with the SciPy library:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm, t
```
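A minimal sketch of the comparison described above is given here. It uses
simulated returns in place of real data (drawn from a Student's t with 3
degrees of freedom, an assumption made so that the heavy tails are known to
be present), fits both distributions with SciPy, and compares the
probability each assigns to an extreme loss.

```python
import numpy as np
from scipy.stats import norm, t

rng = np.random.default_rng(3)

# Simulated heavy-tailed daily returns (hypothetical, not market data)
returns = 0.01 * t.rvs(df=3, size=5000, random_state=rng)

# Fit a normal distribution and a Student's t-distribution by maximum likelihood
mu_hat, sigma_hat = norm.fit(returns)
df_hat, loc_hat, scale_hat = t.fit(returns)

# Probability of a return more than four sample standard deviations below the mean
threshold = mu_hat - 4 * sigma_hat
p_normal = norm.cdf(threshold, mu_hat, sigma_hat)
p_student = t.cdf(threshold, df_hat, loc_hat, scale_hat)
p_empirical = float(np.mean(returns < threshold))

print(f"Fitted t degrees of freedom: {df_hat:.2f}")
print(f"P(extreme loss), normal fit:    {p_normal:.6f}")
print(f"P(extreme loss), Student's t:   {p_student:.6f}")
print(f"P(extreme loss), empirical:     {p_empirical:.6f}")
```

When the data are heavy-tailed, the normal fit tends to understate the
chance of extreme losses relative to the t fit, which is the risk management
concern raised in the text.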
things are the evidence that the dragon still exists, for they are all the
effects of terror: terror of truth and knowledge and hard fact, the old
terror of man “a stranger and afraid, in a world he never made.” This
monster dwells not in the desert places of the earth, but in the hearth
and home of every man. Its appetite is enormous and its destructive
powers are equalled only by its fertility. Like all the other dragons, it
is begotten by dogma out of ignorance.
It would be a mistake to suppose (as some have done) that
religion is altogether a bad thing because it has fostered many
errors, or altogether a fraud because it is profitable to priests. Every
science under the sun has fostered innumerable errors, and every
doctor on earth practises pious frauds daily, seldom solely for his
private ends. Mankind as a whole has had a hand in these
imaginings for half-a-hundred centuries; our certain knowledge of our
surroundings is to this day infinitesimal; and “it is part of our human
make-up to bridge the gaps in our experience with rumours, with
conjectures, and with soothing traditions.”
Not many months ago there came to these shores a Chinese
game, Mah Jongg, so perfected in the course of centuries that not
even a Chinaman can cheat at it. Is it too much to hope that, with the
general increase of knowledge and the general recognition of the
limits to which our knowledge can attain, this old world may yet
produce some saint or hero who will finally rescue Andromeda from
the dragon?

Transcriber’s Notes:
Printer’s, punctuation, and spelling inaccuracies were silently
corrected.
Archaic and variable spelling has been preserved.
*** END OF THE PROJECT GUTENBERG EBOOK PERSEUS; OR,
OF DRAGONS ***

Updated editions will replace the previous one—the old editions
will be renamed.

Creating the works from print editions not protected by U.S.
copyright law means that no one owns a United States copyright
in these works, so the Foundation (and you!) can copy and
distribute it in the United States without permission and without
paying copyright royalties. Special rules, set forth in the General
Terms of Use part of this license, apply to copying and
distributing Project Gutenberg™ electronic works to protect the
PROJECT GUTENBERG™ concept and trademark. Project
Gutenberg is a registered trademark, and may not be used if
you charge for an eBook, except by following the terms of the
trademark license, including paying royalties for use of the
Project Gutenberg trademark. If you do not charge anything for
copies of this eBook, complying with the trademark license is
very easy. You may use this eBook for nearly any purpose such
as creation of derivative works, reports, performances and
research. Project Gutenberg eBooks may be modified and
printed and given away—you may do practically ANYTHING in
the United States with eBooks not protected by U.S. copyright
law. Redistribution is subject to the trademark license, especially
commercial redistribution.

START: FULL LICENSE

THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the
free distribution of electronic works, by using or distributing this
work (or any other work associated in any way with the phrase
“Project Gutenberg”), you agree to comply with all the terms of
the Full Project Gutenberg™ License available with this file or
online at www.gutenberg.org/license.

Section 1. General Terms of Use and Redistributing Project Gutenberg™ electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand,
agree to and accept all the terms of this license and intellectual
property (trademark/copyright) agreement. If you do not agree to
abide by all the terms of this agreement, you must cease using
and return or destroy all copies of Project Gutenberg™
electronic works in your possession. If you paid a fee for
obtaining a copy of or access to a Project Gutenberg™
electronic work and you do not agree to be bound by the terms
of this agreement, you may obtain a refund from the person or
entity to whom you paid the fee as set forth in paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only
be used on or associated in any way with an electronic work by
people who agree to be bound by the terms of this agreement.
There are a few things that you can do with most Project
Gutenberg™ electronic works even without complying with the
full terms of this agreement. See paragraph 1.C below. There
are a lot of things you can do with Project Gutenberg™
electronic works if you follow the terms of this agreement and
help preserve free future access to Project Gutenberg™
electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright
law in the United States and you are located in the United
States, we do not claim a right to prevent you from copying,
distributing, performing, displaying or creating derivative works
based on the work as long as all references to Project
Gutenberg are removed. Of course, we hope that you will
support the Project Gutenberg™ mission of promoting free
access to electronic works by freely sharing Project
Gutenberg™ works in compliance with the terms of this
agreement for keeping the Project Gutenberg™ name
associated with the work. You can easily comply with the terms
of this agreement by keeping this work in the same format with
its attached full Project Gutenberg™ License when you share it
without charge with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside
the United States, check the laws of your country in addition to
the terms of this agreement before downloading, copying,
displaying, performing, distributing or creating derivative works
based on this work or any other Project Gutenberg™ work. The
Foundation makes no representations concerning the copyright
status of any work in any country other than the United States.

1.E. Unless you have removed all references to Project Gutenberg:

1.E.1. The following sentence, with active links to, or other
immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project
Gutenberg™ work (any work on which the phrase “Project
Gutenberg” appears, or with which the phrase “Project
Gutenberg” is associated) is accessed, displayed, performed,
viewed, copied or distributed:

This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it
away or re-use it under the terms of the Project Gutenberg
License included with this eBook or online at
www.gutenberg.org. If you are not located in the United
States, you will have to check the laws of the country where
you are located before using this eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is
derived from texts not protected by U.S. copyright law (does not
contain a notice indicating that it is posted with permission of the
copyright holder), the work can be copied and distributed to
anyone in the United States without paying any fees or charges.
If you are redistributing or providing access to a work with the
phrase “Project Gutenberg” associated with or appearing on the
work, you must comply either with the requirements of
paragraphs 1.E.1 through 1.E.7 or obtain permission for the use
of the work and the Project Gutenberg™ trademark as set forth
in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is
posted with the permission of the copyright holder, your use and
distribution must comply with both paragraphs 1.E.1 through
1.E.7 and any additional terms imposed by the copyright holder.
Additional terms will be linked to the Project Gutenberg™
License for all works posted with the permission of the copyright
holder found at the beginning of this work.

1.E.4. Do not unlink or detach or remove the full Project
Gutenberg™ License terms from this work, or any files
containing a part of this work or any other work associated with
Project Gutenberg™.
1.E.5. Do not copy, display, perform, distribute or redistribute
this electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the
Project Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if
you provide access to or distribute copies of a Project
Gutenberg™ work in a format other than “Plain Vanilla ASCII” or
other format used in the official version posted on the official
Project Gutenberg™ website (www.gutenberg.org), you must, at
no additional cost, fee or expense to the user, provide a copy, a
means of exporting a copy, or a means of obtaining a copy upon
request, of the work in its original “Plain Vanilla ASCII” or other
form. Any alternate format must include the full Project
Gutenberg™ License as specified in paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,
performing, copying or distributing any Project Gutenberg™
works unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or
providing access to or distributing Project Gutenberg™
electronic works provided that:

• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who
notifies you in writing (or by e-mail) within 30 days of receipt that
s/he does not agree to the terms of the full Project Gutenberg™
License. You must require such a user to return or destroy all
copies of the works possessed in a physical medium and
discontinue all use of and all access to other copies of Project
Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of
any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project
Gutenberg™ electronic work or group of works on different
terms than are set forth in this agreement, you must obtain
permission in writing from the Project Gutenberg Literary
Archive Foundation, the manager of the Project Gutenberg™
trademark. Contact the Foundation as set forth in Section 3
below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend
considerable effort to identify, do copyright research on,
transcribe and proofread works not protected by U.S. copyright
law in creating the Project Gutenberg™ collection. Despite
these efforts, Project Gutenberg™ electronic works, and the
medium on which they may be stored, may contain “Defects,”
such as, but not limited to, incomplete, inaccurate or corrupt
data, transcription errors, a copyright or other intellectual
property infringement, a defective or damaged disk or other
medium, a computer virus, or computer codes that damage or
cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES -
Except for the “Right of Replacement or Refund” described in
paragraph 1.F.3, the Project Gutenberg Literary Archive
Foundation, the owner of the Project Gutenberg™ trademark,
and any other party distributing a Project Gutenberg™ electronic
work under this agreement, disclaim all liability to you for
damages, costs and expenses, including legal fees. YOU
AGREE THAT YOU HAVE NO REMEDIES FOR NEGLIGENCE,
STRICT LIABILITY, BREACH OF WARRANTY OR BREACH
OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE
TRADEMARK OWNER, AND ANY DISTRIBUTOR UNDER
THIS AGREEMENT WILL NOT BE LIABLE TO YOU FOR
ACTUAL, DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE
OR INCIDENTAL DAMAGES EVEN IF YOU GIVE NOTICE OF
THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If
you discover a defect in this electronic work within 90 days of
receiving it, you can receive a refund of the money (if any) you
paid for it by sending a written explanation to the person you
received the work from. If you received the work on a physical
medium, you must return the medium with your written
explanation. The person or entity that provided you with the
defective work may elect to provide a replacement copy in lieu
of a refund. If you received the work electronically, the person or
entity providing it to you may choose to give you a second
opportunity to receive the work electronically in lieu of a refund.
If the second copy is also defective, you may demand a refund
in writing without further opportunities to fix the problem.

1.F.4. Except for the limited right of replacement or refund set
forth in paragraph 1.F.3, this work is provided to you ‘AS-IS’,
WITH NO OTHER WARRANTIES OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of
damages. If any disclaimer or limitation set forth in this
agreement violates the law of the state applicable to this
agreement, the agreement shall be interpreted to make the
maximum disclaimer or limitation permitted by the applicable
state law. The invalidity or unenforceability of any provision of
this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the
Foundation, the trademark owner, any agent or employee of the
Foundation, anyone providing copies of Project Gutenberg™
electronic works in accordance with this agreement, and any
volunteers associated with the production, promotion and
distribution of Project Gutenberg™ electronic works, harmless
from all liability, costs and expenses, including legal fees, that
arise directly or indirectly from any of the following which you do
or cause to occur: (a) distribution of this or any Project
Gutenberg™ work, (b) alteration, modification, or additions or
deletions to any Project Gutenberg™ work, and (c) any Defect
you cause.

Section 2. Information about the Mission of Project Gutenberg™
Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new
computers. It exists because of the efforts of hundreds of
volunteers and donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project
Gutenberg™’s goals and ensuring that the Project Gutenberg™
collection will remain freely available for generations to come. In
2001, the Project Gutenberg Literary Archive Foundation was
created to provide a secure and permanent future for Project
Gutenberg™ and future generations. To learn more about the
Project Gutenberg Literary Archive Foundation and how your
efforts and donations can help, see Sections 3 and 4 and the
Foundation information page at www.gutenberg.org.

Section 3. Information about the Project Gutenberg Literary Archive Foundation
The Project Gutenberg Literary Archive Foundation is a non-
profit 501(c)(3) educational corporation organized under the
laws of the state of Mississippi and granted tax exempt status by
the Internal Revenue Service. The Foundation’s EIN or federal
tax identification number is 64-6221541. Contributions to the
Project Gutenberg Literary Archive Foundation are tax
deductible to the full extent permitted by U.S. federal laws and
your state’s laws.

The Foundation’s business office is located at 809 North 1500
West, Salt Lake City, UT 84116, (801) 596-1887. Email contact
links and up to date contact information can be found at the
Foundation’s website and official page at
www.gutenberg.org/contact

Section 4. Information about Donations to the Project Gutenberg Literary Archive Foundation
Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission
of increasing the number of public domain and licensed works
that can be freely distributed in machine-readable form
accessible by the widest array of equipment including outdated
equipment. Many small donations ($1 to $5,000) are particularly
important to maintaining tax exempt status with the IRS.

The Foundation is committed to complying with the laws
regulating charities and charitable donations in all 50 states of
the United States. Compliance requirements are not uniform
and it takes a considerable effort, much paperwork and many
fees to meet and keep up with these requirements. We do not
solicit donations in locations where we have not received written
confirmation of compliance. To SEND DONATIONS or
determine the status of compliance for any particular state visit
www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states
where we have not met the solicitation requirements, we know
of no prohibition against accepting unsolicited donations from
donors in such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot
make any statements concerning tax treatment of donations
received from outside the United States. U.S. laws alone swamp
our small staff.

Please check the Project Gutenberg web pages for current
donation methods and addresses. Donations are accepted in a
number of other ways including checks, online payments and
credit card donations. To donate, please visit:
www.gutenberg.org/donate.

Section 5. General Information About Project Gutenberg™ electronic works
Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could
be freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose
network of volunteer support.

Project Gutenberg™ eBooks are often created from several
printed editions, all of which are confirmed as not protected by
copyright in the U.S. unless a copyright notice is included. Thus,
we do not necessarily keep eBooks in compliance with any
particular paper edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg
Literary Archive Foundation, how to help produce our new
eBooks, and how to subscribe to our email newsletter to hear
about new eBooks.
