Reinforcement Learning - Assignment 2

07/11/2023, 13:05 HW Assignment 2
Salman Yousaf salmany@uchicago.edu

In [1]: import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display, HTML

import gym
from gym import spaces
Question 1
In [2]: total_steps = 100
n = 8
# One-period rating transition matrix in percent (rows: from, columns: to)
Pmat = np.array([[90.81,8.33,0.68,0.06,0.08,0.02,0.01,0.01],
                 [0.7,90.65,7.79,0.64,0.06,0.13,0.02,0.01],
                 [0.09,2.27,91.05,5.52,0.74,0.26,0.01,0.06],
                 [0.02,0.33,5.95,85.93,5.3,1.17,1.12,0.18],
                 [0.03,0.14,0.67,7.73,80.53,8.84,1,1.06],
                 [0.01,0.11,0.24,0.43,6.48,83.46,4.07,5.2],
                 [0.21,0,0.22,1.3,2.38,11.24,64.86,19.79],
                 [0,0,0,0,0,0,0,100],
                 ],dtype=float) / 100
# P[t] holds the (t+1)-step transition matrix, i.e. Pmat to the power t+1
P = np.zeros((total_steps,n,n),dtype=np.float64)
P[0] = Pmat
for t in range(1,total_steps):
    P[t] = np.matmul(P[t-1],Pmat)
ratings = ['AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D']

# One figure per starting rating, one curve per destination rating
steps = range(1,total_steps+1)
for i, from_rating in enumerate(ratings):
    plt.figure(figsize=(10,8))
    for j, to_rating in enumerate(ratings):
        # Plot against the step count 1..100 so the x-axis matches P[t] = Pmat^(t+1)
        plt.plot(steps, P[:,i,j], label = f"{from_rating} -> {to_rating}")
    plt.xlim(0,100)
    plt.ylim(0,1.2)
    plt.title(f'N-Step Transition Probabilities for {from_rating}')
    plt.xlabel('Step n')
    plt.ylabel('Probability')
    plt.legend()
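As a cross-check on the loop above, the n-step matrices can also be obtained directly with `np.linalg.matrix_power`. The sketch below is illustrative only: the helper name `n_step` is an assumption and not part of the assignment. It verifies that the rows of an n-step matrix still sum to 1 and that the Chapman-Kolmogorov identity holds. Note that in the cell above `P[t]` stores the (t+1)-step matrix.

```python
import numpy as np

# Transition matrix copied from the assignment (percent, converted to probabilities)
Pmat = np.array([
    [90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01],
    [0.7, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01],
    [0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06],
    [0.02, 0.33, 5.95, 85.93, 5.3, 1.17, 1.12, 0.18],
    [0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1, 1.06],
    [0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.2],
    [0.21, 0, 0.22, 1.3, 2.38, 11.24, 64.86, 19.79],
    [0, 0, 0, 0, 0, 0, 0, 100],
], dtype=float) / 100


def n_step(P, k):
    """Hypothetical helper: k-step transition matrix P^k."""
    return np.linalg.matrix_power(P, k)


P2 = n_step(Pmat, 2)
P3 = n_step(Pmat, 3)
P5 = n_step(Pmat, 5)

# Each row of an n-step matrix is still a probability distribution
rows_ok = np.allclose(P5.sum(axis=1), 1.0)
# Chapman-Kolmogorov: the 5-step matrix equals the 2-step times the 3-step matrix
ck_ok = np.allclose(P5, P2 @ P3)
print(rows_ok, ck_ok)
```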
Question 2
The Markov chain has two communicating classes.
The first class contains the states "AAA", "AA", "A", "BBB", "BB", "B", and "CCC".
These states all communicate: each can reach every other with positive probability,
possibly through intermediate ratings (for example, "CCC" cannot move to "AA" in one
step, but it can get there via "AAA"). This class is transient, because every state in
it has a positive probability of moving to "D".
The second class contains the single state "D". It is absorbing: once the chain enters
state "D", it cannot leave.
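The class structure can be checked mechanically. The following is a minimal sketch, assuming reachability is read off the positive entries of the transition matrix and closed under transitivity with a Warshall-style pass; the variable names (`reach`, `communicate`, `classes`) are illustrative.

```python
import numpy as np

ratings = ['AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D']
Pmat = np.array([
    [90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01],
    [0.7, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01],
    [0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06],
    [0.02, 0.33, 5.95, 85.93, 5.3, 1.17, 1.12, 0.18],
    [0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1, 1.06],
    [0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.2],
    [0.21, 0, 0.22, 1.3, 2.38, 11.24, 64.86, 19.79],
    [0, 0, 0, 0, 0, 0, 0, 100],
], dtype=float) / 100

n = len(ratings)
# reach[i, j] is True when j can be reached from i in zero or more steps
reach = (Pmat > 0) | np.eye(n, dtype=bool)
for k in range(n):
    # Warshall step: i reaches j if i reaches k and k reaches j
    reach |= reach[:, k][:, None] & reach[k, :][None, :]

# Two states communicate when each can reach the other
communicate = reach & reach.T

classes = []
seen = set()
for i in range(n):
    if i in seen:
        continue
    members = [j for j in range(n) if communicate[i, j]]
    seen.update(members)
    classes.append([ratings[j] for j in members])

print(classes)
```

The loop groups states by mutual reachability, so a state with no way back from the states it reaches (here "D" as seen from the other ratings) ends up in its own class.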
Question 3
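Every state has a positive one-step self-transition probability (each diagonal entry of
the transition matrix is greater than zero), so every state can return to itself in one
step and therefore has period 1. In particular, the absorbing state "D" (a state that,
once entered, cannot be left) has period 1. The chain is aperiodic.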
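The period of a state is the greatest common divisor of the step counts at which a return to that state has positive probability. The sketch below computes it numerically over a finite horizon; the function name `period` and the cutoff `max_n` are assumptions made for illustration.

```python
from math import gcd

import numpy as np

ratings = ['AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D']
Pmat = np.array([
    [90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01],
    [0.7, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01],
    [0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06],
    [0.02, 0.33, 5.95, 85.93, 5.3, 1.17, 1.12, 0.18],
    [0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1, 1.06],
    [0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.2],
    [0.21, 0, 0.22, 1.3, 2.38, 11.24, 64.86, 19.79],
    [0, 0, 0, 0, 0, 0, 0, 100],
], dtype=float) / 100


def period(P, i, max_n=20):
    """gcd of all k <= max_n with (P^k)[i, i] > 0; returns 0 if no return is seen."""
    g = 0
    Pk = np.eye(len(P))
    for k in range(1, max_n + 1):
        Pk = Pk @ P
        if Pk[i, i] > 0:
            g = gcd(g, k)
    return g


periods = {r: period(Pmat, i) for i, r in enumerate(ratings)}
print(periods)
```

Because every diagonal entry of `Pmat` is positive, a return is already possible at k = 1, which forces the gcd to 1 for every state.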
Question 4
In [3]: np.random.seed(1234)
states = ['AAA']
transition_lh = 1.0
summary = pd.DataFrame(columns=['Step','State_i','State_j','Transition Probability'])
for t in range(total_steps):
    # Draw the next rating from the row of Pmat for the current rating
    next_state = np.random.choice(range(n), p = Pmat[ratings.index(states[-1])])
    states.append(ratings[next_state])
    # Probability of the transition that was just simulated
    transition_lh = Pmat[ratings.index(states[-2]),ratings.index(states[-1])]
    summary = pd.concat([summary, pd.DataFrame({'Step':[t+1],'State_i':[states[-2]],'State_j':[states[-1]],
                                                'Transition Probability':[transition_lh]})], ignore_index=True)
In [4]: plt.figure(figsize=(10,5))
plt.step(range(total_steps+1),states)
plt.title('Bond Transition Simulation with Initial AAA Rating')
plt.xlabel('Time (t)')
plt.ylabel('State')
Out[4]: Text(0, 0.5, 'State')
Likelihood of each transition, and entire simulated sequence:

In [5]: display(summary)
     Step  State_i  State_j  Transition Probability
0       1      AAA      AAA                  0.9081
1       2      AAA      AAA                  0.9081
2       3      AAA      AAA                  0.9081
3       4      AAA      AAA                  0.9081
4       5      AAA      AAA                  0.9081
...   ...      ...      ...                     ...
95     96       BB       BB                  0.8053
96     97       BB       BB                  0.8053
97     98       BB       BB                  0.8053
98     99       BB      BBB                  0.0773
99    100      BBB      BBB                  0.8593
100 rows × 4 columns
In [6]: print(f"Sequence Likelihood: {np.prod(summary['Transition Probability'])}")
Sequence Likelihood: 7.571112140228321e-21
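The sequence likelihood is the product of the one-step transition probabilities along the path, which is why it is already of order 1e-21 after 100 steps. For longer paths it is usually safer to sum log-probabilities instead. The sketch below illustrates this on a short hypothetical path (not the simulated one); `path_log_likelihood` is an illustrative helper name.

```python
import numpy as np

ratings = ['AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D']
Pmat = np.array([
    [90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01],
    [0.7, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01],
    [0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06],
    [0.02, 0.33, 5.95, 85.93, 5.3, 1.17, 1.12, 0.18],
    [0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1, 1.06],
    [0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.2],
    [0.21, 0, 0.22, 1.3, 2.38, 11.24, 64.86, 19.79],
    [0, 0, 0, 0, 0, 0, 0, 100],
], dtype=float) / 100


def path_log_likelihood(path, P, labels):
    """Sum of log transition probabilities along a path of state labels."""
    idx = [labels.index(s) for s in path]
    # Pair each state with its successor and look up the transition probability
    probs = P[idx[:-1], idx[1:]]
    return float(np.sum(np.log(probs)))


# Hypothetical three-transition path, chosen only for illustration
path = ['AAA', 'AAA', 'AA', 'A']
log_lik = path_log_likelihood(path, Pmat, ratings)
likelihood = np.exp(log_lik)
print(log_lik, likelihood)
```

Exponentiating the summed logs recovers the same number as the direct product, here P(AAA→AAA) · P(AAA→AA) · P(AA→A).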
Question 5
In [8]: Q = Pmat[:-1,:-1]
I = np.identity(Q.shape[0])
N = np.linalg.inv(I - Q)
title = "Expected No. of Transitions Prior to Entering Absorbing State"

display(HTML(f'<h4>{title}</h4>'))
display(pd.DataFrame(N, index=ratings[:-1], columns=ratings[:-1]))
Expected No. of Transitions Prior to Entering Absorbing State

           AAA         AA          A        BBB         BB          B       CCC
AAA  12.888602  20.423473  32.921814  19.773631   9.880736   8.693628  1.942995
AA    2.096882  21.566386  33.163148  19.873529   9.878730   8.700411  1.944556
A     1.273665  10.541247  34.796554  20.045532   9.945452   8.686041  1.944228
BBB   0.891828   7.125979  21.863100  21.915782   9.931234   8.610521  1.988953
BB    0.616925   4.709273  14.029035  13.026152  12.483263   9.095669  1.830750
B     0.373842   2.744633   7.972364   7.147212   6.179523  11.156707  1.699788
CCC   0.279353   1.648535   4.723658   4.222820   3.310801   4.609543  3.610820
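Entry (i, j) of this matrix is the expected number of periods the chain spends in rating j before default when it starts in rating i, counting the starting period when i = j. Summing a row therefore gives the expected number of periods until default from that rating. The sketch below computes those totals two ways; the names `t_from_inverse` and `t_from_solve` are illustrative, and solving the linear system is shown only as a common alternative to forming the inverse.

```python
import numpy as np

ratings = ['AAA', 'AA', 'A', 'BBB', 'BB', 'B', 'CCC', 'D']
Pmat = np.array([
    [90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01],
    [0.7, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01],
    [0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06],
    [0.02, 0.33, 5.95, 85.93, 5.3, 1.17, 1.12, 0.18],
    [0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1, 1.06],
    [0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.2],
    [0.21, 0, 0.22, 1.3, 2.38, 11.24, 64.86, 19.79],
    [0, 0, 0, 0, 0, 0, 0, 100],
], dtype=float) / 100

# Transient-to-transient block and fundamental matrix, as in the cell above
Q = Pmat[:-1, :-1]
I = np.eye(Q.shape[0])
N = np.linalg.inv(I - Q)

# Expected number of periods before default: row sums of N
t_from_inverse = N.sum(axis=1)
# Same quantity from the linear system (I - Q) t = 1, without forming the inverse
t_from_solve = np.linalg.solve(I - Q, np.ones(Q.shape[0]))

for label, t in zip(ratings[:-1], t_from_solve):
    print(f"{label}: {t:.2f}")
```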
Question 6
In [13]: import seaborn as sns

# Submatrix H of transition probabilities between transient states
H = Pmat[0:(n-1),0:(n-1)]
# Compute the fundamental matrix Z = (I - H)^-1
Z = np.linalg.inv(np.identity(H.shape[0])-H)
# Empty DataFrame to store the results
results = pd.DataFrame(index=ratings[0:-1], columns=ratings[0:-1])
for i in range(n-1):
    for j in range(n-1):
        if i == j:
            # Probability of ever returning to state i: (Z_ii - 1) / Z_ii
            f_ij = (Z[i, j] - 1) / Z[j, j]
        else:
            # Probability of ever reaching state j from state i: Z_ij / Z_jj
            f_ij = Z[i, j] / Z[j, j]
        results.iloc[i, j] = f_ij
print(results)
# Heatmap of the probabilities
sns.heatmap(results.astype(float), annot=True, cmap='YlGnBu')
plt.show()
          AAA        AA         A       BBB        BB         B       CCC
AAA  0.922412  0.947005  0.946123  0.902255  0.791519  0.779229  0.538104
AA   0.162693  0.953632  0.953058  0.906814  0.791358  0.779837  0.538536
A    0.098821  0.488781  0.971262  0.914662  0.796703  0.778549  0.538445
BBB  0.069195  0.330421  0.628312  0.954371  0.795564   0.77178  0.550831
BB   0.047866  0.218362  0.403173  0.594373  0.919893  0.815265  0.507018
B    0.029006  0.127264  0.229114  0.326122  0.495025  0.910368  0.470749
CCC  0.021674   0.07644  0.135751  0.192684  0.265219  0.413163  0.723055

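Every entry in the table is strictly positive, so each of the transient states "AAA"
through "CCC" can reach every other one with positive probability; these seven states
form a single communicating class. The chain as a whole is not irreducible, because the
absorbing state "D" cannot reach any other state. Every entry is also below 1, which
reflects the chance of defaulting before the target rating is reached.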
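The probabilities of ever reaching a rating can be cross-checked without the fundamental-matrix formula. The sketch below, written under the assumption that default ("D") is the only absorbing state, solves the first-step equations directly for each target rating j: for i ≠ j, h_i = P_ij + Σ_{k≠j} P_ik h_k, and the return probability is P_jj + Σ_{k≠j} P_jk h_k. The function name `hitting_probs` is illustrative.

```python
import numpy as np

Pmat = np.array([
    [90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01],
    [0.7, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01],
    [0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06],
    [0.02, 0.33, 5.95, 85.93, 5.3, 1.17, 1.12, 0.18],
    [0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1, 1.06],
    [0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.2],
    [0.21, 0, 0.22, 1.3, 2.38, 11.24, 64.86, 19.79],
    [0, 0, 0, 0, 0, 0, 0, 100],
], dtype=float) / 100

Q = Pmat[:-1, :-1]          # transitions among the seven transient ratings
m = Q.shape[0]

# Fundamental-matrix route: (Z_ij - delta_ij) / Z_jj, column j divided by Z_jj
Z = np.linalg.inv(np.eye(m) - Q)
F_fundamental = (Z - np.eye(m)) / np.diag(Z)


def hitting_probs(Q, j):
    """Probability of visiting transient state j at some step >= 1, per start state."""
    size = Q.shape[0]
    others = [k for k in range(size) if k != j]
    # First-step equations restricted to the transient states other than j
    A = np.eye(size - 1) - Q[np.ix_(others, others)]
    h = np.zeros(size)
    h[others] = np.linalg.solve(A, Q[others, j])
    # Return probability for j itself: stay put, or leave and come back
    h[j] = Q[j, j] + Q[j, others] @ h[others]
    return h


F_direct = np.column_stack([hitting_probs(Q, j) for j in range(m)])
print(np.allclose(F_fundamental, F_direct))
```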
Question 7
In [14]: # Number of states and periods
N = 8
T = 5
# f[t,i,j]: probability that the first visit to j, starting from i, occurs at step t+1
f = np.zeros((T,N,N), dtype=np.float64)
# Pbar[j]: Pmat with column j set to zero (one-step moves that avoid state j)
Pbar = np.zeros((N,N,N), dtype=np.float64)
for j in range(N):
    for i in range(N):
        for k in range(N):
            if k != j:
                Pbar[j,i,k] = Pmat[i,k]
            else:
                Pbar[j,i,k] = 0
# First-passage recursion: f(1) = Pmat[:,j], then f(t) = Pbar[j] f(t-1)
for j in range(N):
    for t in range(T):
        if t == 0:
            f[t,:,j] = Pmat[:,j]
        else:
            f[t,:,j] = np.matmul(Pbar[j,:,:], f[t-1,:,j])
# Probabilities of reaching AAA and CCC within 5 periods
prob_AAA = np.sum(f[:T,:,0], axis=0)
prob_CCC = np.sum(f[:T,:,6], axis=0)
# Print the probabilities
print("Probability of reaching AAA rating within 5 periods:")
for i, label in enumerate(ratings):
    print(f"From {label}: {prob_AAA[i]:.6f}")
Probability of reaching AAA rating within 5 periods:

From AAA: 0.910192
From AA: 0.029763
From A: 0.005267
From BBB: 0.001754
From BB: 0.001586
From B: 0.001124
From CCC: 0.005535
From D: 0.000000
If a bond currently has an "AAA" rating, there is a 91.02% chance that it is rated "AAA"
again at some point within the next 5 periods. If a bond currently has an "AA" rating,
there is only a 2.98% chance that it is upgraded to "AAA" at least once within 5 periods,
and so on.
In [15]: # Print the probabilities
print("\nProbability of reaching CCC rating within 5 periods:")
for i, label in enumerate(ratings):
    print(f"From {label}: {prob_CCC[i]:.6f}")
Probability of reaching CCC rating within 5 periods:

From AAA: 0.000945
From AA: 0.002568
From A: 0.007545
From BBB: 0.051917
From BB: 0.066085
From B: 0.153521
From CCC: 0.665099
From D: 0.000000
If a bond currently has an “AAA” rating, there’s only a 0.09% chance it will be
downgraded to “CCC” at least once within 5 periods. However, if a bond currently has a
“CCC” rating, there’s a 66.51% chance it will be rated “CCC” again at some point within
the next 5 periods, and so on.
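The "within 5 periods" figures can be cross-checked with a different construction: make the target rating absorbing, raise that modified matrix to the power T - 1, and take one step of the original chain into it. The sketch below compares this with the first-passage recursion; both function names are illustrative, not part of the assignment.

```python
import numpy as np

Pmat = np.array([
    [90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01],
    [0.7, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01],
    [0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06],
    [0.02, 0.33, 5.95, 85.93, 5.3, 1.17, 1.12, 0.18],
    [0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1, 1.06],
    [0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.2],
    [0.21, 0, 0.22, 1.3, 2.38, 11.24, 64.86, 19.79],
    [0, 0, 0, 0, 0, 0, 0, 100],
], dtype=float) / 100


def within_T_recursion(P, j, T):
    """Sum of first-passage probabilities into j over steps 1..T."""
    Pbar = P.copy()
    Pbar[:, j] = 0.0            # one-step moves that avoid j
    f_t = P[:, j].copy()        # first passage at step 1
    total = f_t.copy()
    for _ in range(T - 1):
        f_t = Pbar @ f_t        # first passage exactly one step later
        total += f_t
    return total


def within_T_absorbing(P, j, T):
    """Same probability, computed by making j absorbing."""
    P_abs = P.copy()
    P_abs[j, :] = 0.0
    P_abs[j, j] = 1.0
    # One step of the original chain, then at most T-1 steps with j absorbing
    return P @ np.linalg.matrix_power(P_abs, T - 1)[:, j]


T = 5
prob_AAA = within_T_recursion(Pmat, 0, T)
prob_CCC = within_T_recursion(Pmat, 6, T)
agree = all(
    np.allclose(within_T_recursion(Pmat, j, T), within_T_absorbing(Pmat, j, T))
    for j in range(len(Pmat))
)
print(agree)
```

Taking the first step with the original matrix is what keeps the diagonal case a return probability: a bond that starts in the target rating must still make a transition before it can count as having reached it.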
In [16]: # Print the one-step return probability f_ii(1) = P_ii for each state
for i in range(n):
    print(f'fi,{ratings[i]}: {round(P[0][i,i],4)}')
fi,AAA: 0.9081
fi,AA: 0.9065
fi,A: 0.9105
fi,BBB: 0.8593
fi,BB: 0.8053
fi,B: 0.8346
fi,CCC: 0.6486
fi,D: 1.0
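This confirms the intuition that ratings are sticky: the one-step probabilities of
keeping the current rating (0.9081 for AAA, 0.6486 for CCC) account for almost all of
the 5-period return probabilities computed above (0.910192 and 0.665099).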