$CSCO trades at an all-time high today

March 27, 2000. That was the previous all-time intraday trading high for Cisco stock until this morning. If you held on for 6,501 trading days, congratulations, you are finally above water again. OK, well, with dividends you are comfortably ahead, but just on headline price, after 26 years $CSCO has attained a new all-time intraday high.


What all happened in between? Oh, just 9/11, wars in Afghanistan and Iraq, Hurricane Katrina, the Indian Ocean tsunami (don't remember that? It claimed 230,000 lives), the Great Recession, the first Black US president, the Fukushima nuclear disaster, the storming of the US Capitol, the global COVID-19 pandemic, a land war in Europe, Facebook, the iPod, YouTube, the iPhone, Netflix (streaming), cloned animals, generative AI. And of course the dot-com bust that started it all, which began with the largest company on Earth by market cap (CSCO) reaching $82 per share.

As for me, I was 29 years old, not even 30 yet. Our first child, Austin, was only 2 years old, probably just finishing up potty training. We had just moved into the home we had custom built in Moore, OK, at 213 N Olde Bridge Road. I was so mad at myself for letting the builder put the house too close to the road.

What happened to Cisco stock post March 27, 2000? Well, it was pretty quick on the way down:

  • 70.75 on March 30, 2000 (10%+ lower, 3 trading days later)
  • 55.06 on April 14, 2000 (30%+ lower, 14 trading days)
  • 50.00 flat on May 22, 2000 (40% lower, 40 trading days)
  • 32.63 on Jan 2, 2001 (60% down, 160 trading days)
  • 15.00 flat on March 19, 2001 (80% lower, 1 year later)
  • 11.04 on September 27, 2001 (85% lower, 1.5 years later)
  • 8.12 on October 8, 2002 (90% wiped out, 2.5 years later)

So from $82 to $8. That October 2002 price was the all-time low; even during the GFC the price only fell to the high teens.

How about post Oct 2002 – what was the path on the way up?

  • 9.21 on October 8, 2002 (10%+ higher, intraday)
  • 11.00 flat on Oct 15, 2002 (30%+ higher, 5 trading days later)
  • 15.48 on November 21, 2002 (90% higher, 25 trading days)
  • 20.57 on Sept 3, 2003 (150% up, 1 year later)
  • 30.00 flat on July 16, 2007 (270% up, 5 years later)
  • 40.24 on Jan 10, 2018 (400% up, 16 years later)
  • 83.24 on Feb 3, 2026 (1,000% up, ¼ century later)

It is interesting that the early milestones on the way down took slightly longer (3 days, 14 days, 40 days) than the early milestones on the way up (intraday, 5 days, 25 days), meaning the market gives you very little opportunity to sell the true highs and even less opportunity to buy the true lows.

The later milestones on the way down came much faster (1 year, 1.5 years, 2.5 years) than the corresponding milestones on the way back up (5 years, 16 years, 24 years). You have to have patience if you are investing (not trading). And sometimes, if you don't trade, you are left with generationally lost money.

Here are some photos of yours truly back at the now-previous zenith of the stock in March 2000.

At the time this was taken, $CSCO was $66 per share and I was 29 years old. My son today is 27.
I was working for Enterasys/Cabletron at the time. We had an event where I played Neo from The Matrix. I'm sure I owned the room in this outfit.
Our little family of 3

Actually, though, Cisco is not exactly at an all-time high today. Back in March 2000 the market cap of CSCO was $500B. We have spent so much money on stock buybacks, net retiring shares, that the market cap is now $330B. But if you throw the dividends back in, then yes, the stock is effectively worth more today than it was in absolute (not inflation-adjusted) terms in 2000. To get whole on an inflation-adjusted basis (still accounting for dividends reinvested) we need the stock to trade at about $100/share. My money says we will get there soon. There is this AI thing that people are talking about… (Did I mention we actually used to own InfiniBand (Topspin) and we killed it, but it came back?) So many things happen in 26 years.

AI hallucinations and MNIST digits

Easy question: What digit (0-9) is this?


It was a “2”. Got that? Well, this should also be another easy question: What digit is this?

If you said, "it's not a digit, it is the letter Q," you are wrong. Listen to the rules again: what digit (0-9) is this? The answer is of course 8, as seen here in the output of a 3-layer MLP neural network:

Not convinced? Well, you should be. With enough training samples and epochs the neural net gets great accuracy at reading handwritten digits.

OK, what? Back to the beginning – what are we even talking about?

A foundational piece of machine learning, really one of the early tasks that could not simply be programmed away, was recognizing handwritten digits. This work goes back to the 1990s and the first (artificial) convolutional neural networks. It is basically impossible to write if/then statements that identify a number 2. You could write code that says if pixel positions 620-630 all have intensity greater than 0.8, then you probably have a line across the bottom, hence a feature of a number 2. But obviously that does not scale or cope with all the variability in how people write.
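
Just to make that dead end concrete, here is a hypothetical sketch (mine, not from any real OCR project) of what such a hand-coded rule might look like:

import numpy as np

# Hypothetical hand-coded rule for "is this a 2?" on a flattened 28x28
# image with pixel intensities scaled to [0, 1]. Purely illustrative of
# why this approach collapses under real handwriting variability.
def looks_like_a_two(pixels: np.ndarray) -> bool:
    bottom_stroke = pixels[620:630]               # a band of pixels near the bottom rows
    has_bottom_line = bool(np.all(bottom_stroke > 0.8))
    # ...and you would need hundreds more brittle checks like this, one for
    # every way a person might slant, shift, thin, or squash their "2".
    return has_bottom_line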

Take this handwritten "3". How do we know it is a 3? Well, come to think of it, I actually do not. This particular piece of data was not labeled by its author. Another problem for another time.

MNIST focused on taking handwritten digits and converting them to 28×28 grayscale so machines could process them. So first, convert this to a 28×28 grayscale:

Now visualize that as a 28×28 matrix, each element between 0-255 (8-bit):
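
For reference, here is roughly what that conversion looks like in Python. This is a sketch assuming Pillow and NumPy are installed; the file name three.png is hypothetical:

from PIL import Image
import numpy as np

img = Image.open("three.png").convert("L")   # "L" = 8-bit grayscale
img = img.resize((28, 28))                   # MNIST-sized image

matrix = np.array(img)                       # shape (28, 28), values 0-255
vector = matrix.flatten()                    # shape (784,), one element per pixel

print(matrix)                                # the 28x28 grid of intensities
print(vector.shape)                          # (784,)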

That is perfect for working with machines. Notice this can be treated as a 784-element column vector with a value of 0-255 in each place. We build our neural network as follows:

784 neurons on the left, one for each pixel of the input. 10 neurons on the right, one for each output possibility (0-9). Sticking with our "3", that means the inputs are X151=63, X152=99 (those are the first nonzero pixel values you see in the matrix above), and so on straight to the end, where pixel 784, or X784, equals 0. The output should be [0 0 0 1 0 0 0 0 0 0], meaning we have 100% confidence the input is a "3" and not something else. Don't worry about the AI black box magic right now, we'll address that in a minute. Here is the actual output we get:

===============================================
SINGLE TEST SAMPLE SOFTMAX OUTPUT PROBABILITIES
===============================================
True label: n/a → Model predicted: 3
Confidence: 0.892576 (89.26%)
----------------------------------------------
Class Name Softmax Prob Bar
-----------------------------------------------
0 0 0.000000 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
1 1 0.000001 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
2 2 0.000001 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
3 3 0.892576 ████████████████████████████████████████ 89.26%
4 4 0.000000 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
5 5 0.091811 █████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 9.18%
6 6 0.000000 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
7 7 0.000000 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
8 8 0.014620 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 1.46%
9 9 0.000992 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.10%
-----------------------------------------------
PREDICTED CLASS: 3 (3) with 89.26% confidence
===============================================

A well-communicated written "3". There is still a 10% chance it is a 5 or an 8, but that's after just 2 passes through (also known as 2 epochs of) the training set of 60,000 MNIST digits. As we go to 10 epochs and beyond, the output neurons do go to 100% (or [0 0 0 1 0 0 0 0 0 0], more specifically). Here is how the model (a Python script you can pull from GitHub, link at the bottom) is invoked:

./mnist_digits_10.py --train_classes d --test_classes d --test_samples 1 --epochs 10 --show_softmax_output_probabilities --test_seed 24 --train_samples 60000


Training on classes: ['d']
Testing on classes: ['d']
Using device: mps
Proportional dataset created:
  digit: 60000 samples (100.0%)
Loaded MNIST test dataset: 10000 samples
Using train_seed=42 for training data selection
Using test_seed=24 for test data selection
Using 60000 training samples and 1 test samples
Model created with 10 output classes
Starting training for 10 epochs
Epoch 1/10, Samples: 3200/60000, Loss: 1.2241
Epoch 1/10, Samples: 6400/60000, Loss: 0.473
  ...(skipping to the end of training output)...
Epoch 10/10, Samples: 51200/60000, Loss: .01
Epoch 10/10, Samples: 54400/60000, Loss: .01
Epoch 10/10, Samples: 57600/60000, Loss: .01
Training completed in 20.21 seconds
Average time per sample: 0.03 ms

==================================================
SINGLE TEST SAMPLE SOFTMAX OUTPUT PROBABILITIES
==================================================
True label: n/a → Model predicted: 3
Confidence: 1.000000 (100.00%)
--------------------------------------------------
Class    Name     Softmax Prob    Bar
--------------------------------------------------
0        0        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
1        1        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
2        2        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
3        3        1.000000    ████████████████████████████████████████ 100.00%
4        4        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
5        5        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
6        6        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
7        7        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
8        8        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
9        9        0.000000    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%
--------------------------------------------------
PREDICTED CLASS: 3 (3) with 100.00% confidence
==================================================


 

What if we train less? Like, instead of 60,000 training images and 10 epochs, how about 100 training images and 2 epochs (so it will only be exposed to about 20 "3"s: roughly 10 distinct "3"s, seen 2 times each)?

./mnist_digits_10.py --train_classes d --test_classes d --test_samples 1 --epochs 2 --show_softmax_output_probabilities --test_seed 24 --train_samples 100
Training on classes: ['d']
Testing on classes: ['d']
Using train_seed=42 for training data selection
Using test_seed=24 for test data selection
Using 100 training samples and 1 test samples
Model created with 10 output classes
Starting training for 2 epochs
Training completed in 0.19 seconds
Average time per epoch: 0.09 seconds
Average time per sample: 0.94 ms

=================================================
SINGLE TEST SAMPLE SOFTMAX OUTPUT PROBABILITIES
==================================================
True label: 3 → Model predicted: 0
Confidence: 0.137974 (13.80%)
--------------------------------------------------
Class    Name     Softmax Prob    Bar
--------------------------------------------------
0        0        0.137974    ███████████████████████████████████████  13.80%
1        1        0.074135    ███████████████░░░░░░░░░░░░░░░░░░░░░░░░   7.41%
2        2        0.088325    █████████████████████░░░░░░░░░░░░░░░░░░   8.83%
3        3        0.113393    ██████████████████████████████░░░░░░░░░  11.34%
4        4        0.102612    ██████████████████████████░░░░░░░░░░░░░  10.26%
5        5        0.104179    ██████████████████████████░░░░░░░░░░░░░  10.42%
6        6        0.103599    ██████████████████████████░░░░░░░░░░░░░  10.36%
7        7        0.080653    ██████████████████░░░░░░░░░░░░░░░░░░░░░   8.07%
8        8        0.106269    ███████████████████████████░░░░░░░░░░░░  10.63%
9        9        0.088862    █████████████████████░░░░░░░░░░░░░░░░░░   8.89%
-------------------------------------------------
PREDICTED CLASS: 0 (0) with 13.80% confidence
=================================================

A lot worse. The model has not learned. And we have our first hallucination — we fed it a “3” and the AI said we have a “0”. That’s bad.

But what about our Q that the model said was an 8? From a hallucination standpoint, our fundamental limitation is that the network only had 10 output neurons. No matter what it was fed, it had to output something between 0 and 9. Therefore, the "Q" became an "8". Look at this: what digit is this?

You can see from the label on the top that this is truly whitespace (ws). There is nothing there. Yet the MLP neural net predicted it was in fact a "5". How close was it?

./mnist_digits_12.py --train_classes d --test_classes w --test_samples 1 --epochs 10 --show_softmax_output_probabilities --test_seed 327 --train_samples 10000 --visualize 5

True label: 11 → Model predicted: 5
Confidence: 0.228395 (22.84%)
--------------------------------------------------
Class    Name     Softmax Prob    Bar
--------------------------------------------------
0        0        0.062223    ██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   6.22%
1        1        0.137581    █████████████████████████░░░░░░░░░░░░░░░  13.76%
2        2        0.086253    ██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░   8.63%
3        3        0.066895    ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   6.69%
4        4        0.085063    █████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░   8.51%
5        5        0.228395    ████████████████████████████████████████  22.84%
6        6        0.110055    ████████████████████░░░░░░░░░░░░░░░░░░░░  11.01%
7        7        0.104153    █████████████████░░░░░░░░░░░░░░░░░░░░░░░  10.42%
8        8        0.062765    ███████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   6.28%
9        9        0.056616    ███████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   5.66%
-------------------------------------------------
PREDICTED CLASS: 5 (5) with 22.84% confidence
==================================================

As you would expect, there is not much conviction in this prediction, even after going through the data for 10 epochs. Of course, this was not a fair fight: I trained the MLP only on digits and then asked it to find me a digit in a perfectly blank image.

How do we avoid this? We add new classes to the training: whitespace and Not a Number (NaN). When we train that way, the results avoid hallucinations caused by test data that falls outside the scope of the training data. We invoke the script now with classes d,w for both training and testing:

./mnist_digits_10.py --train_classes d,w --train_samples 50000 --test_classes d,w --test_samples 1000 --epochs 10

Training on classes: ['d', 'w']
Testing on classes: ['d', 'w']
Proportional dataset created:
  digit: 45454 samples (90.9%)
  whitespace: 4545 samples (9.1%)
Loaded MNIST test dataset: 10000 samples
Loaded whitespace test dataset: 24000 samples
Proportional dataset created:
  digit: 909 samples (91.0%)
  whitespace: 90 samples (9.0%)
Using train_seed=42 for training data selection
Using test_seed=42 for test data selection
Using 49999 training samples and 999 test samples
Model created with 12 output classes
Starting training for 10 epochs
Epoch 1/10, Samples: 3200/49999, Loss: 1.4390
Epoch 1/10, Samples: 6400/49999, Loss: 0.5672
Epoch 1/10, Samples: 9600/49999, Loss: 0.3649
  ...
Epoch 10/10, Samples: 44800/49999, Loss: 0.0100
Epoch 10/10, Samples: 48000/49999, Loss: 0.0185
Training completed in 17.04 seconds
Average time per sample: 0.03 ms
Overall accuracy on 999 test images: 97.90%
Total Type I errors: 0 / 999 (0.00%)
Total Type II errors: 21 / 999 (2.10%)

Detailed breakdown:
  Class 0: 97.9% (94/96) correct, incorrect: 2
  Class 1: 100.0% (102/102) correct, incorrect: none
  Class 2: 98.9% (89/90) correct, incorrect: 1
  Class 3: 97.1% (101/104) correct, incorrect: 3
  Class 4: 98.1% (104/106) correct, incorrect: 2
  Class 5: 98.6% (71/72) correct, incorrect: 1
  Class 6: 98.6% (71/72) correct, incorrect: 1
  Class 7: 98.7% (75/76) correct, incorrect: 1
  Class 8: 93.3% (84/90) correct, incorrect: 6
  Class 9: 96.0% (97/101) correct, incorrect: 4
  Class ws: 100.0% (90/90) correct, incorrect: none

And beautifully, the model was fed 90 blank images and saw every single one as blank. Perfect.

But look at the “8”s and “9”s – only 93.3% and 96% accurate there.

But this whole exercise got me thinking:

Is it possible to avoid hallucinations by training to avoid hallucinations?

Stated another way, “Is it possible to get better accuracy identifying 0-9 digits (using the same amount of computational power) if you train on digits *and* whitespace *and* non-numbers?”

Our end goal is to avoid a digit-to-digit hallucination. We don't want to be presented with a 4 and say it is a 9. That is a Type II error, and it is what we want to avoid at all costs. Let's look at standard training with just digits (using backslashes for readability):

./mnist_digits_10.py \
  --train_classes d \
  --train_samples 50000 \
  --test_classes d \
  --test_samples 1000 \
  --epochs 10 \
  --visualize 5 \
  --test_seed 19

Total Type II errors: 19 / 1000 (1.90%)

Let’s look at one of the 19 failed OCR attempts, a 4 that was misread as a 9.

  Class 4: 98.0% (96/98) correct, incorrect: 1 (9), 1 (7)

Note this particular digit error is a very bad 4 (half of the MNIST digits were written by high schoolers). However, it is labeled data, so we know without a doubt it is truly a 4.

Note our total Type II errors are at 1.9%. Now, if we give the model the exact same testing data (including this bad "4") but train on 45,000 digits plus 5,000 not-a-number samples, do we get better results on digit-to-digit hallucinations? What do we predict for this "4"?

./mnist_digits_10.py \
  --train_classes d,nan \
  --train_samples 50000 \
  --test_classes d \
  --test_samples 1000 \
  --epochs 10 \
  --visualize 5 \
  --test_seed 19

Total Type I errors: 1 / 1000 (0.10%)
Total Type II errors: 19 / 1000 (1.90%)
  Class 4: 100.0% (98/98) correct, incorrect: none

So no better, no worse. 1.9% to 1.9%. Although the percentage remains the same, the individual errors are different. For example, our lousy “4” is now predicted properly (2nd of these 5 below):

But other errors come up, including a single Type I error where we were given a digit and predicted it was not a number.
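
To be concrete about the bookkeeping (this is my own sketch of the tally, not the script's exact code): a Type I error is a true digit pushed into a non-digit class like whitespace or NaN, while a Type II error is a true digit read as some other digit.

def count_errors(true_labels, predicted_labels):
    # Labels 0-9 are digits; labels 10+ stand in for the extra classes
    # (whitespace / not-a-number) in this sketch.
    type1 = type2 = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth <= 9 and pred >= 10:
            type1 += 1        # digit rejected as blank / not-a-number
        elif truth <= 9 and pred != truth:
            type2 += 1        # digit hallucinated as a different digit
    return type1, type2

print(count_errors([3, 4, 7], [11, 9, 7]))    # -> (1, 1)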

Let's try this with roughly 40,000 labeled digits, 5,000 not-a-number samples, and 5,000 blanks. Here is a fuller script output:

./mnist_digits_10.py \
  --train_classes d,nan,w \
  --train_samples 50000 \
  --test_classes d \
  --test_samples 1000 \
  --epochs 10 \
  --visualize 5 \
  --test_seed 19

Training on classes: ['d', 'nan', 'w']
Testing on classes: ['d']
Reported testing statistics will exclude impossible inferences
Visualization options: ['5']
Using device: mps
EMNIST letters: 80000 total, 52301 after excluding C,L,N,O,Q,R,S,W,Z,c,l,n,o,q,r,s,w,z
Loaded EMNIST letters dataset: 52301 samples
Proportional dataset created:
  digit: 41666 samples (83.3%)
  letter: 4166 samples (8.3%)
  whitespace: 4166 samples (8.3%)
Loaded MNIST test dataset: 10000 samples
Using train_seed=42 for training data selection
Using test_seed=19 for test data selection
Using 49998 training samples and 1000 test samples
Model created with 12 output classes
Starting training for 10 epochs
Epoch 1/10, Samples: 3200/49998, Loss: 1.4763
Epoch 1/10, Samples: 6400/49998, Loss: 0.6118
Epoch 1/10, Samples: 9600/49998, Loss: 0.4315
   ... 
Overall accuracy on 1000 test images: 98.00%
Total Type I errors: 2 / 1000 (0.20%)
Total Type II errors: 20 / 1000 (2.00%)

So, no. Went from 1.9% error rate to 2.0%.

After much testing: in general, unfortunately no, you cannot get better results against hallucinations by training for hallucinations. It seems counterintuitive, but all of that extra training compute is essentially wasted. The testing digits are always 0-9, and although the model has learned blank and NaN, it is never presented with those during testing. This holds at larger epoch counts and smaller batch sizes. On average, you do 0.3% worse on accuracy when you train with whitespaces, 0.15% worse when you train with not-a-number, and 0.2% worse when you train with both whitespaces and not-a-number. Boo. 😞

If you do want to avoid Type II errors at all costs, the better way is simply to reject every inference where the confidence is less than some high percentage, say 99.95%. That gets to 100% accuracy across the entire 10,000-image MNIST test set even with five epochs. It surprises me that a 90% confidence threshold at just 2 epochs is not enough, but there are some really badly written digits in there, for example:

Rejecting low-confidence inferences is much more effective at avoiding hallucinations than training for the unexpected.
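
In code, that rejection rule is just a threshold on the top softmax probability. A minimal sketch, assuming probs is the 10-element softmax output for one test image:

import numpy as np

def classify_or_reject(probs, threshold=0.9995):
    predicted = int(np.argmax(probs))
    confidence = float(probs[predicted])
    if confidence < threshold:
        return None, confidence        # reject rather than risk a hallucination
    return predicted, confidence

# The 89.26% "3" from earlier in the post would be rejected under this rule:
probs = np.array([0.0, 0.000001, 0.000001, 0.892576, 0.0, 0.091811, 0.0, 0.0, 0.014620, 0.000992])
print(classify_or_reject(probs))       # -> (None, 0.892576)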

Thanks for reading! And please, play with the code yourself — I published the code on GitHub.

==================================================

Appendix: Oh, we never dove back into the hidden layers. Here is the full 3-layer neural network.

The point of learning is to change the weights in the 3 layers (fc1, fc2, fc3) to minimize the error. PyTorch randomly selects initial weights and biases (for example, in the range (-0.09, 0.09) for fc1) and then adjusts the weights with each batch of training images. Here is what our feedforward multi-layer perceptron (MLP) neural network looks like:

The input layer is 784 wide. Hidden layer 1 is 256 neurons. Hidden layer 2 is 128 neurons. Layer 3 is the output layer with 10 output neurons. That is a total of about 400 neurons. Fully connected, it has about 235,000 synapses.
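
In PyTorch, that architecture is only a few lines. This is a minimal sketch reconstructed from the description above, not necessarily the exact code in the GitHub script:

import torch.nn as nn

class DigitMLP(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 256)      # input pixels -> hidden layer 1
        self.fc2 = nn.Linear(256, 128)          # hidden layer 1 -> hidden layer 2
        self.fc3 = nn.Linear(128, num_classes)  # hidden layer 2 -> output layer
        self.relu = nn.ReLU()

    def forward(self, x):
        x = x.view(x.size(0), -1)               # flatten each 28x28 image to 784 values
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        return self.fc3(x)                      # logits; softmax is applied when reporting probabilities

model = DigitMLP()
print(sum(p.numel() for p in model.parameters()))   # ~235k weights and biases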

If you were to count each input pixel as a neuron, then there are about 1,200 neurons. However, the input layer is just that, an input, not neurons. It is analogous to your eyes: rods and cones connected to neurons in the brain, where the rods and cones themselves are not the neurons doing the thinking. Also, the input layer is not fed in as 0-255 (the grayscale intensity); it is fed in as a value between 0 and 1 (the raw intensity of that pixel divided by 255).

Really, neurons are just matrix math. That first hidden neuron in blue (h11) is simply each weight times each input, summed, plus its bias.
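
Written out (the standard form of a fully connected neuron; the subscripts are my notation, not from the original figure):

h_{1,1} = \mathrm{ReLU}\left( \sum_{i=1}^{784} w_{1,i}\, x_i + b_{1,1} \right)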

The ReLU just means that if the sum is positive, it stays; if the sum is negative, then h11 = 0. It stands for Rectified Linear Unit. Doing this matrix math over and over again is what GPUs do (and, we now think, maybe what the human brain does). Once training is complete, the learned features (digits) are the AI black box magic stored inside the weights and biases.

Results of an 18-year experiment on college savings

My parents (God rest their souls) were always helpful with our 4 kids and money for college. When Addison and Emerson were born, they started a 529 savings account for each child and contributed $100 every month. When the girls were two years old I matched that by contributing an additional $101 every month. We continued that until my parents moved in 2022. So, we averaged about $201 in every month, or about $2,500 each year, for ~15 years. I took out ~$20,000 from each account from 2018-2024 for my graduate school and Evan's schooling. So how much has that ~$13,000 in net principal grown to, now that they are both 18 and heading out for college next fall?

Well, when Addison and Emerson were still preschoolers I decided I would run an 18-year experiment. I would invest all of Addison's account in an SP500 fund, and Emerson's I would invest 50% in the SP500 fund and 50% in the age-based fund, the kind that is more aggressive when the kid is younger and more conservative as the kid nears college.

Of course this was not to favor or handicap one child. As the account owner I can always shift money, and the expectation is that mom and dad will pay for college. I was never intending to let Addison go only to OCCC and Emerson to Vanderbilt if the market did badly, or the reverse if the market did well.

So, looking behind the covers, the fund in the Oklahoma college savings plan (the only one I get a state tax deduction for) is now called the "U.S. Equity Index Option", which, per their page https://www.oklahoma529.com/investment/risk-based/us-equity-index-option/, is really 100% invested in the symbol TIEIX, the Nuveen Equity Index Fund. The benchmark for that fund is the Russell 3000 Index. I know the Russell 2000, but the Russell 3000? Never heard of it. So how has that fund done compared to SPY? Well, pretty similar. Both are market-cap-weighted indexes; the SP500 is just limited to the top 500 stocks, while the Russell 3000 holds 3,000. The top stock weighting of the SP500 is NVDA at 7.96%, and the top of the R3000 is also NVDA at 6.78%.

To measure performance, however, you must cite your source of truth. Even for something as straightforward as SPY, look at the different reported results for 5-year SPY performance. Let's look at State Street (the fund's owner), Google Finance, Yahoo Finance, and Seeking Alpha for $10k invested 5 years ago:

  • State Street: $19,685
  • Google Finance: $19,513
  • Yahoo Finance (chart): $20,336
  • Seeking Alpha: $20,341

So why the difference? Dividends? Let's use Yahoo Finance daily historicals:

Going from the adjusted close of 307.83 to today's price of 682.80 is a 121.96% gain, so $10k invested is now $22,196. So no, dividends were not even included in those figures. Going from the non-adjusted price of 330.20 to today's 682.80 is a 106.9% gain, so $10k becomes $20,692. Here is the updated table:

  • State Street: $19,685
  • Google Finance: $19,513
  • Yahoo Finance (chart): $20,336
  • Seeking Alpha: $20,341
  • Yahoo Finance (historical, adjusted close): $22,196
  • Yahoo Finance (historical, actual close): $20,692

Current value of $10k invested in SPY 5 years ago, per each source
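
The underlying arithmetic is simple enough to sanity-check in a couple of lines (a sketch using the Yahoo Finance prices quoted above):

def value_of_10k(start_price, end_price):
    return 10_000 * end_price / start_price

print(value_of_10k(307.83, 682.80))   # adjusted close -> roughly $22.2k (dividends reinvested)
print(value_of_10k(330.20, 682.80))   # actual close   -> roughly $20.7k (price change only)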

This is kind of a big deal. The difference between $22,196 and $19,513 is 14%. Heck, even without dividends, the difference between $20,692 and $19,513 is still 6%. 6% is two years of interest on a cash account.

OK, set aside for the moment that management fees suck and the industry feeds off you like mosquitos feed off water buffalo. I guess we just accept it.

Back to the initial question — how did the 2 portfolios do? Well, here are the values today (Nov 3, 2025).

So, $20k more, just by choosing full SP500 over the 50/50 SP500/age fund split. Heck, if I had just chosen the age fund for both, each would be only about $40k, not even enough to fund 1.5 years of school.

So now let's look closer. I chose the Oklahoma plan and the investment choices it provides, but what if I had run the money on my own? Let's look at several scenarios:

  1. US Equity Fund, from OK4Savings.org
  2. 2024/2025 Enrollment Option, also from OK4Savings.org
  3. TIEIX
  4. SPY
  5. QQQ
  6. TQQQ

1) As a baseline, here are the contributions to Addison's account and the total value now:

$13k net principal becomes $80k now. Good, not great.

2) OK, some will say I was “too aggressive” with 100% US Equity Index, and I should have done the 2024/2025 Enrollment option. Look here:

That would take $13k to only $21k. Awful. Note, there were a lot of age-band rolls and automatic fund switches as she got older, prior to 2020. That's why the year-end prices look so rigid, but this is directionally correct.

Lesson 1 — Screw the people who tell you to invest according to the kid's age. It is just plain bad advice.

I know the market is at all-time highs now, so it is easy to draw up lesson #1, but I strongly feel it is the correct lesson. If the market tumbles anyway, then school tuition and fees would go down commensurately. Don't spend your life (and investments) worrying. It will kill you in the end, just like the unworthy servant who hid his talent in the ground. After all, that servant did safeguard and return the master's talent in perfect condition. What did it get him? He got called "wicked and slothful" by Jesus himself. Do you really want God to look at you and call you wicked or slothful? If not, heed lesson 1.

3) OK, now what if instead of TIAA/CREF, I just opened a Schwab account and invested in TIEIX directly?

Wow, that is a hell of a difference. $110k instead of $80k. Sure, you don’t get the state tax deduction, but that was not worth $30k. Again, financial management fees suck and the industry feeds off you like mosquitos feed off water buffalo.

4) OK, what if I invested in SPY instead of TIEIX in that Schwab account:

Basically the same as TIEIX, actually $5k worse. But basically the same.

5) Let's get riskier: how about QQQ? The girls were born, and this account was set up, around the same time the iPhone 1 came out. Betting on tech (of course hindsight is 20/20) would have been a good thing:

Wow – $13k becomes $200k. Now you are talking. College is fully funded and mom and dad don’t have to dip into savings.

Lesson 2 — Invest risky, and keep the investment on no matter what (COVID, etc)

If you have belief in the future of the US, Lesson 2 is the only lesson you need to remember. As long as we are 4% of the world's population but 33% of the world's economy, the reserve currency of the world, the strongest military in the world, etc., bet on the US for the very long term. We have a US exceptionalism and optimism that does not exist in other countries. Hell, in western Europe they are damn apologists. No one wants to buy into that thinking.

6) OK, final risk-up trade, right up there with putting it all in bitcoin. What if I had done TQQQ (the 3x leveraged ETF)? The fund did not exist until 2010, so just take the principal inflows from 2007-2009 and dump them in the trash can. What do we get?

That’s right $1.2 million! And not going from $13k to $1.2 million, more like $8k to $1.2 million, because this starts in 2010. Now we are talking!

In truth, I don't think there is any way I would have been able to keep that level of risk tolerance on. There is no way I would not have sold during COVID or on any number of other days (like even today, just a 1% down day). But if I had, wow. I wish I had is all I can say.

Lesson 3 — All the savings, all the austerity (delay that purchase, get the cheaper model) may all be worthless noise. Clicking a few correct buttons a few times in your life >> all else.

Lesson 3 is kind of depressing, actually, but real and true. Sure, I wish I had twenty bitcoin lying around — I set up my first bitcoin full node when each coin was $100. Coulda / Woulda / Shoulda. My dad used to tell me he should have bought Boeing in the 70s and he would be passing $10 million to me. He didn’t and I didn’t. Oh well, I loved my dad very much, and that was worth a lot more than $10 million, and I am being very honest about it.

Still though, my hope is people reading this will have both — ample funds for all your dreams, and long and healthy lives with family. Thanks for reading!

Could you have made money in the stock market today?

I am curious for myself. The market opened about 1.75% down, quickly fell to down 2.5% before 9am, and then rallied all the way back to close down just 0.15% on the day, a rally of ~1.5% from open to close. Could I have made significant money with $NDXP options today? My gut tells me no, but I want to see.

Specifically, the most likely trade I would have done is buy $NDXP call options at the open, betting that $NDX would get back to flat by the end of the day. Those obviously would have expired worthless, as NDX did not end the day positive. But how about some others? The VIX is currently at 21, and my gut feel is that it is too high to make money buying out-of-the-money calls, but let's see.

First, what if I had bought at the open with a strike 0.5% below Friday's close? NDX closed Friday at 19,280, and 0.5% down is about 19,180. NDX opened today at 18,990. The first trade for .NDXP-25-03-31-C-19,180 was at $19. So you would have made $100 / $19 = about 5x your money. Damn. That is a lot.

OK, so how about if you bought .NDXP-25-03-31-C-19,200 at 8:48am when the market was down the full 2.5%. You could have picked it up for $9. It would have ended at $80. Almost 10X. Damn.

What about an ATM call? You could have bought .NDXP-25-03-31-C-19,000 at 8:30am for $100, or at 8:45am for $40. It would have closed at $280, meaning 3X or 7X your money.

Damn. Well, I guess you could have made money today.

Just how good are the 2025 OKC Thunder?

OK, I'll start this one off by admitting this post is total procrastination. It is Friday morning and I should be doing something productive, but instead I want to look at the metrics for the 2025 OKC Thunder. The stock market is down another 2% today, so don't look there. My curiosity was piqued when, on local sports radio, I heard the announcer say that last week's game between OKC and the LA Clippers was the first one-possession game OKC has played all year. It is late March. So, let's dig through the numbers:

As I write this the Thunder are 61-12 with 9 regular season games left to go. They have already wrapped up the #1 seed in the Western Conference; no other Western Conference team has even locked up a playoff spot yet (!!). In 1-possession games this year (defined as a final margin of 3 points or fewer, plus OT games) the Thunder are 1-4. So the radio talking head was wrong: it was not the Thunder's first 1-possession game, it was just the first 1-possession game the Thunder have won. The Thunder are 6-2 in 2-possession games and 54-6 in 3+ possession games. Here is the full record:

2025 Thunder record to date

The record is sorted by point differential. OT games are boxed in, 1-possession games are in puke-yellow, and 2-possession games are in sea-foam-green. 3-possession+ games are in white.

The Thunder play a one-possession game about once a month. That is nuts. Compare the current champs, the Boston Celtics:

The Celtics have played twice as many 1-possession games (10) and won 7 of those.

So if you convert the Thunder's 1-4 record in 1-possession games to 4-1 (or even better, 5-0), they would be 64-9 (or 65-8) currently, and theoretically could win out to go 74-8. Only slightly crazy talk, because look at the 2015-16 Golden State Warriors:

Notice their good fortune in close games: they went 10-0 in games decided by 2 possessions. They even went 7-2 in 1-possession games, for a combined 17-2. Their 3+ possession record of 56-7 is going to end up being worse than the Thunder's, who could finish as high as 63-6.

It has been widely reported this year that OKC's average margin of victory (currently +13.1) is the largest in NBA history (all time). It handily beats the 2nd and 3rd best teams this year (Cleveland at +10.4 and Boston at +9.1). To put in perspective how great those two numbers are on their own, the 4th and 5th place teams this year are in the +4 range. It even handily beats Jordan's 1995-96 Bulls (+12.3) and the 2015-16 GSW team (+10.8) that went 73-9. The current all-time best is the 1971-72 Lakers at +13.9. But that needs to be adjusted for pace: back in the 1970s there were many more possessions per 48 minutes. No 3-point line, few set offenses, just fast breaks and dunks (Showtime!). The metric that does this adjustment is Net Rating. Net rating adjusts for pace, making it possible to compare teams that play at different speeds. For example, a fast-paced team might have a good point differential simply because it plays more possessions per game, while net rating reveals whether it is actually more efficient on a per-possession basis. Here are the net-rating comparisons:
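
For reference, net rating is just point differential normalized to 100 possessions, i.e. offensive rating minus defensive rating. A quick sketch with made-up numbers (not the actual team stats) to show the pace effect:

def net_rating(points_for, points_against, possessions):
    off_rtg = 100 * points_for / possessions      # points scored per 100 possessions
    def_rtg = 100 * points_against / possessions  # points allowed per 100 possessions
    return off_rtg - def_rtg

# Same +10 margin per game, very different pace:
print(round(net_rating(120, 110, 105), 1))   # fast pace -> +9.5 per 100 possessions
print(round(net_rating(110, 100, 95), 1))    # slow pace -> +10.5 per 100 possessions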

So, on a Net Rating basis, OKC is up almost +3 on the 2015-16 GSW team and even better than Jordan and the Bulls.

Neville’s Take:

So just how good are these Thunder? Let me make a prediction. The Thunder will be the first team to go 16-0 in the playoffs. It is very likely they sweep anyone in the West, and then the ECF will have Cleveland and Boston slug it out, the winner there being tired and no match for a rested, healthy Thunder. If you are in Vegas, put some money on that prop bet and send a check my way for Father's Day. We all need it after the stock market today 🙂

Book review – The Frackers, by Gregory Zuckerman

Just finished a great read recommended to me by Michael Palmer. The Frackers largely centers on the events of the American shale oil boom from ~2000 until about ~2014, when the book was written and published. The cast of characters is nothing less than American heroes: George Mitchell, Harold Hamm, Aubrey McClendon and Tom Ward, Charif Souki. Several of these heroes went bankrupt or even died, and the impact they have had on our way of life is not appreciated by as many people as it should be.

To put it in perspective, the USA (including Alaska) has around 3-4% of the world's proven oil reserves (around 50,000 million barrels, against a world total of 1,500,000 million barrels). However, we are the #1 producer in the world, pumping 15% of the world's supply (13 million barrels each day, where the world pumps 83 million barrels). In 2005 we pumped only about 5 million barrels per day, with many assuming domestic oil would run out, but the work of these people has pushed us from 5 million to 13 million. As a consequence our gas and electricity bills are less than half of western Europe's: natural gas in Europe costs $10 per million BTU, gas in Asia is about $12 per million, and in the USA it is just $2-3 per million. Natural gas (frequently produced alongside oil) is also much cleaner burning than coal. If not for these men we would be burning 2-3x as much coal, polluting the environment, and paying 3x for the ability to do so.

George Mitchell, who developed The Woodlands area north of Houston, started the commercial development of horizontal drilling and fracking in the 1980s and 1990s. Oil drilling prior to Mitchell was basically drilling a vertical hole in the earth like a big straw and pumping it out. Most fields (like Saudi Arabia's easy oil) are just sitting there in a giant pool. This domestic revolution was shale oil: liquid oil that is there, but trapped inside rock. It takes guts to drill down 2 miles into rock, turn horizontal, drill another 3 miles, then send explosive charges down with water and blow those rocks apart to recover oil. You can see how it is much easier to just drill it in the Middle East and pay to import it.

Aubrey McClendon and Tom Ward (via Chesapeake Energy) really supersized the process and embraced debt to expand operations. Aubrey in particular is someone who should be taught about in OKC metro public schools: he brought forth Classen Curve, transformed the city with the Olympic rowing river south of downtown, and helped bring the Thunder to OKC. He really changed the fate of OKC for the better. Sadly though, Obama could not have given a flip about any of this, and Obama's DOJ witch-hunted him because he lived on the edge with debt and largesse. They indicted him with jail time in mind, and it was too much for Aubrey; he was killed in a car crash, distracted, 24 hours later. This is after Aubrey made many, many landowners very rich by paying billions of dollars for mineral rights. He employed more landmen than other companies had employees. Shame on our government at that time.

Charif Souki is super fascinating. He actually managed the restaurant in LA connected to the OJ Simpson / Nicole Brown / Ron Goldman case. He decided to leave LA after that, move to Louisiana, and get involved in oil. Specifically, he saw all the media reports that the USA was running out of oil and decided to build multibillion-dollar import terminals for liquefied natural gas drilled abroad and then imported into America. At that time both the USA and the rest of the world were at about $2-3 per million BTU, and he foresaw a time when the rest of the world would stay at $3 and the USA would go to $10. Well, as it turned out, because of the domestic shale boom and Russia/Ukraine, the USA is at $2-3 and Europe is at $10. He reconfigured his company midstream to go from importing natural gas to exporting it, and now LNG trades at $230 per share, up 20x since 2010.

Anyway, a great read; the author is on X at @GZuckerman. I love a good nonfiction story, and all Oklahomans should know this one.

AI has a photographic memory

I had a lightbulb moment today. I am taking a class on neural networks taught by the excellent Dr. G. at Stanford Continuing Education. Last lecture we talked about a simple neural network identifying an image, say a boat/plane/car/train. The neural net starts blank, and you feed it labeled images of boats/planes/etc. That input changes the weights of the perceptrons (neuron-mimicking structures in a machine). These weights are simple numbers: think 4, 7, 12.5, whatever. The point is simple numbers (weights) only. These perceptrons connect to each other and have an activation function, so a 12.5 from one perceptron is fed to perceptron #2, and the 2nd perceptron may (or may not) fire a number downstream after being fed a 12.5. That's it. After being trained on numerous boats/planes/cars/trains, if you feed the network a new boat it has not seen before, it is likely to spit out "boat", because this new image fed a 12.6 to the downstream perceptrons: not exactly 12.5, but much closer than a plane or car would be.

The key point to understand in the paragraph above is that AI models (specifically large language models) do not "store" source materials. There is no hard drive with images of boats that can be pulled up. The network has seen many boats, and that has shaped the weights to be what they are. The only memory is these numbers, the weights, not source material in the form of words or images. That bears repeating: if I have a model like gemma-2-27b that is 50GB large, those 50GB are all model weights, with absolutely no supplemental material.

Think about your physics test back in college: your teacher allowed you to write anything you wanted, formulas, pictures, on a 3×5 note card, and as long as you could fit it on that note card you could bring it in at test time. So your brain had the ideas and methods, but you had a note card to remember the exact derivation of final speed based on acceleration and time. What I am trying to say is that the AI language model has no note card. It does not have 50GB of weights plus the text of the Declaration of Independence; it just has 50GB of weights. Sure, it has read (been trained on) the Declaration of Independence, but when I ask Grok/Claude/ChatGPT what the 3rd line of the Declaration of Independence is, it *does not* pull up the internet, read the text, then tell me the answer. It simply pulls the answer out of those 50GB of weights. (Now, this is not exactly true anymore; Grok and the other LLMs can search the internet and take in results, but a traditional old-school LLM like gemma-2-27b does not need, and cannot use, any internet access whatsoever.)

So from these 50GB of weights (not really that big, about the size of 10 movies) it can think (or predict) its way through the words of the Declaration of Independence. Or the Emancipation Proclamation.

So I asked Ara (the xAI voice assistant) to read me, word for word, the Emancipation Proclamation. It said that from its weights it could tell me the document is about 270 words long and 5 paragraphs, and it could give me the gist of each section, but it probably could not recite it word for word. I pulled up Lincoln's handwritten version from the National Archives and read along as I asked Grok to give it to me word for word, or try its best. It nailed EVERY SINGLE WORD! All from the 50GB of weights. I even asked it to tell me about the exceptions Lincoln wrote in inside the margins, where the line spacing is off. This is a very obscure reference. If you do a Google search for 'emancipation proclamation "Norfolk" and "Portsmouth" "line spacing"' you will not get any results. This is just something you have to read and look at. But Grok, after successfully reading me the whole thing (again from "memory", aka the 50GB of model weights), correctly told me the exceptions for Norfolk and Portsmouth were written in between the normal line spacing.

So the lightbulb for me? An LLM is not just smart; it has a photographic memory. It does not have to recall source material on demand, because it can pull EXACT COPIES of things straight from its weights. Maybe today only 270 words like the Emancipation Proclamation, but tomorrow, everything.

AI is freaking amazing at coding

So I know anyone who codes is underwhelmed by that post title. Of course it is, and we have all known that for some time. But how do I convey that to people who are non-programmers? I found myself a couple of weeks ago talking to a person at Cisco, saying that AI tools like ChatGPT are incredible at understanding my intent in code and helping me out, but I felt I was lacking a concrete example that connects with people who don't live in arrays, lists, and hashes.

Well, today I have an easy, low-hanging-fruit example to share. I was updating some code on my playoffpredictor.com site where conferences were hard-coded in:

$conference = array(
    "Air Force" => "G5",
    "Akron" => "G5",
    "Alabama" => "SEC",
    "Appalachian State" => "G5",
    "Arizona" => "B12",

I asked Claude.ai to help me pull conference data from a file I have (fbsTeams.json) which has the data in more traditional .json format:

[
 {
"team_name": "Boston College",
"conference": "ACC",
"cbs_sports_icon": "BC",
"cfdb_name": "Boston College",
"pp_id": "1"
 },
 {
"team_name": "Clemson",
"conference": "ACC",
"cbs_sports_icon": "CLEM",
"cfdb_name": "Clemson",
"pp_id": "2"
 },
 {
"team_name": "Duke",
"conference": "ACC",
"cbs_sports_icon": "DUKE",
"cfdb_name": "Duke",
"pp_id": "3"
 },

AI of course wrote the necessary code for me. Here was the prompt I used:

I want to replace the current hard-coded way of defining $conference and instead move it to reading the .json file and populating it from the file. please write the necessary code for me. The file path is /home/neville/cfbPlayoffPredictor/data/2024/fbsTeams.json.

For teams that pull as conference CUSA, Sun Belt, Mountain West, AAC, FBS Independent, I want those all to map to "G5" in the $conference array. I want the logic to be that if the team does not match SEC,B10,B12, or ACC then replace whatever the .json file has with "G5"

And here is the AI code output:

Here is what I mean by intent: nowhere in the prompt did I say anything about Power 5 conferences. But it understood that and wrote the code that way. Dumber tools would just call it 'filtered_conferences' or some basic name like that, but the AI was smart enough to understand that I want to separate the Power 5 conferences from the G5 conferences, even though I never used the words "power 5".

Of course, in 2024, with the destruction of the Pac-12, there are now only 4 power conferences, so I edited the code to call the array $power4 in my script. But the hard work was done by the AI and I just did some fine-tuning edits. This is a perfect example of how AI should be helping us with day-to-day mundane coding tasks.

2024 Week 2 playoff predictor ATS results

Week 2 is in the books, and for Auburn it is better luck next year. Ouch. Oh well, let's see what we can learn from the computer model on playoffpredictor.com.

The model went .615 (29-18-1) against the spread, which was tied for 2nd out of 37 computer models for the week. Outstanding! And remember, this is with no preseason data of any kind, just the results of the games played in week 1. Some of the results the model called better than Vegas were Oklahoma beating Houston by only 4 and Penn State beating Bowling Green by only 7, when the spreads were -27.5 and -34 respectively. The computer model said -18 and -10.5, which were significant improvements in terms of Mean Squared Error. Speaking of Mean Squared Error, the model came in at +142, with an absolute error of 3.3, which were dead last and next to last respectively out of the 37 computers. This is to be expected, as other computers use player, team, and preseason data. The model predicts no blowouts this early in the season, although we know there will be blowouts in weeks 1-4.
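
For anyone wondering what those error metrics measure here: both compare a model's predicted margin to the actual final margin across the week's games. A quick sketch (the third game below is made up; the first two echo the predictions just mentioned):

def mse(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)

def mae(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

predicted_margins = [18, 10.5, 7]    # model's predicted winning margins
actual_margins    = [4, 7, 21]       # actual winning margins on the field

print(mse(predicted_margins, actual_margins))   # 134.75 -- big misses are punished quadratically
print(mae(predicted_margins, actual_margins))   # 10.5   -- average miss in points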

I don't like this 12-team playoff. After spending last week updating the logic for 12 teams instead of 4, the computer sees these probabilities for teams making the playoff after 2 weeks of data:

Note the top likelihood is Syracuse, due to ease of schedule. It won't last. I'd be surprised to see them still on top of the ACC by week 4.

Right now it says the SEC gets 3.5 teams, the Big10 gets 3 teams, the Big 12 gets 1.5 teams, the ACC gets 2 teams, and the G5 gets 2 teams. I'd expect by season's end it will be SEC 4 teams, Big10 3 teams, Big 12 2 teams, ACC 2 teams, and G5 1 team. I think the talk at season's end will be about who is the 12th team, an 8-4 Missouri or a 10-2 Utah. Ugh. Who cares. What a horrible debate to have.