Notes · writing · research

Blog

Notes on hardware, compilers, and machine learning — and a record of the experiences along the way.

§ Latest writing
  • ·Linear Layouts

    Swizzle by Hand, Swizzle by Algebra

    First we derive a conflict-free shared-memory layout for a 16×32 transpose with nothing but intuition and bit-flips. Then we rebuild the very same answer with linear layouts over F₂ — and watch the hand-derivation turn into an algorithm that works for layouts no one could eyeball.

    Read
  • ·Linear Layouts

    One Bit Off

    The classic XOR swizzle still bank-conflicts a 16×32 transpose. The optimal-swizzling algorithm from the Linear Layouts paper finds the fix — and it differs from the textbook answer by a single bit shift. A worked example after Lei Zhang's linear-layout post.

    Read
  • ·Personal Essay

    On Being a Teaching Assistant

    Things I learned from my teaching assistant experience in ECE2300 SP24. On December 11th, 2023, at precisely 3:02 PM, I received a message from my advisor: “Can we chat a bit this week about the 2300 TAship, say Friday a…

    Read
  • ·Tutorial

    Programming Intel FPGAs on Apple Silicon Macs

    This tutorial shows how to set up a Quartus compilation flow with UTM on Apple Silicon Macs. Introduction: Why would you want to program Intel FPGAs on an Apple Silicon Mac? Maybe you are a student/TA who needs to run s…

    Read
  • ·Conference Journal

    CVPR 2022 at New Orleans

    My experience attending and presenting at CVPR22. The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) is a premier international conference held every year in the US. The 2022 CVPR is held in New Orlean…

    Read
  • ·Paper Reading

    Efficient Path Profiling

    A blog digest of the paper "Efficient Path Profiling" by Thomas Ball and James R. Larus, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 29), IEEE, 1996. Efficient Path Profilin…

    Read
  • ·Tutorial

    How to Set Up VNC Server on a Linux Machine

    A tutorial on setting up a VNC server on Linux machines for remote GUI desktop access. First of all, what is VNC? In computing, Virtual Network Computing is a graphical desktop-sharing system that uses the Remote Frame Buffer p…

    Read
  • ·Research

    Introduction to SystemC

    SystemC is a system-level modeling language, often applied to high-level synthesis. Overview: Loosely speaking, SystemC allows a user to write a set of C++ functions (processes) that are executed under control of a schedu…

    Read
  • ·Research

    Gumbel-Softmax

    Gumbel-Softmax is a reparameterization trick that makes sampling from a categorical distribution differentiable. Why are we interested in Gumbel-Softmax? Gumbel-Softmax makes categorical sampling differentiable. Wh…

    Read
  • ·Readings

    3D Scene Understanding

    Deep learning methods for 3D scene understanding, with a particular focus on unsupervised methods. 3D Datasets. Data Representations. Multiview images: multiple 2D images of the same object from different angles. Depth map. Voxel:…

    Read
  • ·Readings

    Paper Readings

    A reading diary to keep track of papers that I read. Virtualizing FPGAs in the Cloud. Authors: Yue Zha, Jing Li. Venue: ASPLOS 20. Institution: University of Pennsylvania. Point-Voxel CNN for Efficient 3D Deep Learning. Authors: …

    Read
  • ·Tutorial

    Loop Optimization in HLS

    Loop optimization in Vivado HLS. What is II? II means initiation interval. For a function, II is the number of clock cycles before it can accept new inputs, and it is generally the most critical performance metric in any sy…

    Read