Yesterday I wrote about Google’s Reformer in
Today I'm sharing some technical details on how it is implemented to allow an input size of 1M tokens within 16GB of memory.
There is a Jupyter notebook that sheds some light on this, available at https://colab.research.google.com/drive/1MYxvC4RbKeDzY2lFfesN-CvPLKLk00CQ?usp=sharing.
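To get a feel for why 1M tokens in 16GB is a big deal, here is a rough back-of-the-envelope sketch. It is not taken from the notebook: the function names, the float32 assumption, and the window of 128 keys per token are all my own illustrative choices. It only compares the size of one dense attention score matrix with a scheme where each token scores against a small fixed set of keys, which is roughly the idea behind the Reformer's bucketed attention.

```python
# Back-of-the-envelope memory estimate for attention scores (single head).
# All names and the window size are illustrative assumptions, not Reformer's API.

def dense_score_bytes(seq_len: int, bytes_per_value: int = 4) -> int:
    """Bytes for a full seq_len x seq_len score matrix (float32 assumed)."""
    return seq_len * seq_len * bytes_per_value


def windowed_score_bytes(seq_len: int, window: int, bytes_per_value: int = 4) -> int:
    """Bytes when each token only scores against `window` keys."""
    return seq_len * window * bytes_per_value


seq_len = 1_000_000
budget_bytes = 16 * 1024**3  # 16 GB memory budget

dense = dense_score_bytes(seq_len)
windowed = windowed_score_bytes(seq_len, window=128)  # hypothetical window size

print(f"dense scores:    {dense / 1e12:.1f} TB, fits: {dense <= budget_bytes}")
print(f"windowed scores: {windowed / 1e9:.3f} GB, fits: {windowed <= budget_bytes}")
```

Under these assumptions the dense matrix alone needs about 4 TB, while the windowed variant needs about half a gigabyte, which is why restricting what each token attends to matters so much at this sequence length.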
Enjoy.