Jekyll2022-04-14T18:12:40+09:00https://dongkwan-kim.github.io/feed.xmlDongkwan Kimpersonal descriptionDongkwan Kimdongkwan.kim@kaist.ac.krThree Basic Tips for PyTorch Beginners2021-06-09T00:00:00+09:002021-06-09T00:00:00+09:00https://dongkwan-kim.github.io/blogs/tips-for-pytorch-beginners<p>While grading students’ code this semester (Fall 2021), I found some suboptimal patterns that students often use. This article organizes them and introduces more efficient uses of PyTorch.</p> <h2 id="avoid-to-use-your-own-loops-use-pytorchs-functions">Avoid using your own loops; use PyTorch’s functions</h2> <p>Let’s try to find the maximum value of each row of a tensor, i.e., one maximum per index of the first dimension.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">torch</span> <span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">Tensor</span><span class="p">([[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="p">[</span><span class="mi">7</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">4</span><span class="p">]])</span> <span class="n">answer</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">Tensor</span><span class="p">([</span><span class="mf">3.</span><span class="p">,</span> <span class="mf">7.</span><span class="p">])</span> </code></pre></div></div> <p>Iterating over a tensor with primitive Python loops (<code class="language-plaintext highlighter-rouge">for</code> or <code class="language-plaintext highlighter-rouge">while</code>) is very slow.</p> <div class="language-python highlighter-rouge"><div 
class="highlight"><pre class="highlight"><code><span class="c1"># Don't </span><span class="n">max_x</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">empty</span><span class="p">((</span><span class="mi">2</span><span class="p">,))</span> <span class="n">idx</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">dim</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">x</span><span class="p">.</span><span class="n">size</span><span class="p">(</span><span class="mi">0</span><span class="p">)):</span> <span class="n">max_x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">idx</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span> </code></pre></div></div> <p>Instead, use methods implemented in PyTorch.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Do </span><span class="n">max_x</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="nb">max</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">dim</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> </code></pre></div></div> <p>It is nearly impossible to remember all functions in PyTorch. We may not know which functions are implemented or which functions to use. 
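<p>For instance, the per-row lookup that the "Don't" loop above performs by hand is exactly what <code class="language-plaintext highlighter-rouge">torch.gather</code> does. The following sketch (my illustration, not code from the original post) produces the same result in one call:</p>

```python
import torch

x = torch.Tensor([[0, 1, 2, 3], [7, 6, 5, 4]])

# torch.gather picks one element per row at the given indices,
# replacing the manual Python loop over rows.
idx = torch.argmax(x, dim=1)                      # shape [2]
max_x = x.gather(1, idx.unsqueeze(1)).squeeze(1)  # shape [2]
print(max_x)  # tensor([3., 7.])
```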
Thus, it is important to search the documentation first.</p> <h2 id="use--in-slicing-tensors">Use <code class="language-plaintext highlighter-rouge">:</code> in slicing tensors</h2> <p>We often need to select the entire sub-tensor along some dimension. For this case, I have seen students use <code class="language-plaintext highlighter-rouge">torch.arange</code> with the corresponding size.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># H: Tensor, the shape of which is [B, N, F]. # Don't </span><span class="n">H</span> <span class="o">=</span> <span class="n">H</span><span class="p">[</span><span class="n">torch</span><span class="p">.</span><span class="n">arange</span><span class="p">(</span><span class="n">H</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]),</span> <span class="n">idx</span><span class="p">]</span> </code></pre></div></div> <p>This can easily be replaced with a colon (<code class="language-plaintext highlighter-rouge">:</code>).</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Do </span><span class="n">H</span> <span class="o">=</span> <span class="n">H</span><span class="p">[:,</span> <span class="n">idx</span><span class="p">]</span> </code></pre></div></div> <p>If we write a colon for every dimension of the tensor, we can easily recognize its shape. 
This improves the readability of the code and makes it easier to maintain.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Even better </span><span class="n">H</span> <span class="o">=</span> <span class="n">H</span><span class="p">[:,</span> <span class="n">idx</span><span class="p">,</span> <span class="p">:]</span> </code></pre></div></div> <h2 id="avoid-to-call-unnecessary-detach">Avoid calling unnecessary <code class="language-plaintext highlighter-rouge">.detach()</code></h2> <p>Detaching a tensor from a computational graph (by <code class="language-plaintext highlighter-rouge">.detach()</code>) is usually not a good idea. It prevents the gradient from propagating to the part of the graph before that node.</p> <p>In the code below, let’s detach <code class="language-plaintext highlighter-rouge">hidden</code>, the output of <code class="language-plaintext highlighter-rouge">layer_1</code>.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">torch</span> <span class="n">torch</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">manual_seed</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span> <span class="n">layer_1</span><span class="p">,</span> <span class="n">layer_2</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="mi">16</span><span class="p">),</span> <span class="n">torch</span><span class="p">.</span><span class="n">nn</span><span class="p">.</span><span class="n">Linear</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="mi">16</span><span class="p">)</span> <span 
class="n">data</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">rand</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">16</span><span class="p">),</span> <span class="n">torch</span><span class="p">.</span><span class="n">rand</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">16</span><span class="p">)</span> <span class="n">hidden</span> <span class="o">=</span> <span class="n">layer_1</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="c1"># Don't </span><span class="n">hidden</span> <span class="o">=</span> <span class="n">hidden</span><span class="p">.</span><span class="n">detach</span><span class="p">()</span> <span class="n">output</span> <span class="o">=</span> <span class="n">layer_2</span><span class="p">(</span><span class="n">hidden</span><span class="p">)</span> <span class="p">(</span><span class="n">output</span> <span class="o">-</span> <span class="n">labels</span><span class="p">).</span><span class="nb">sum</span><span class="p">().</span><span class="n">backward</span><span class="p">()</span> <span class="c1"># MSE loss </span><span class="n">optim</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">optim</span><span class="p">.</span><span class="n">SGD</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">layer_1</span><span class="p">.</span><span class="n">parameters</span><span class="p">())</span> <span class="o">+</span> <span class="nb">list</span><span class="p">(</span><span class="n">layer_2</span><span class="p">.</span><span class="n">parameters</span><span class="p">()),</span> <span class="n">lr</span><span class="o">=</span><span class="mf">1e-1</span><span class="p">)</span> <span 
class="k">print</span><span class="p">(</span><span class="n">layer_1</span><span class="p">.</span><span class="n">weight</span><span class="p">.</span><span class="n">mean</span><span class="p">())</span> <span class="n">optim</span><span class="p">.</span><span class="n">step</span><span class="p">()</span> <span class="k">print</span><span class="p">(</span><span class="n">layer_1</span><span class="p">.</span><span class="n">weight</span><span class="p">.</span><span class="n">mean</span><span class="p">())</span> </code></pre></div></div> <p>Then, the parameters of <code class="language-plaintext highlighter-rouge">layer_1</code> do not change even after <code class="language-plaintext highlighter-rouge">.step()</code>. In most cases, this is not the result we want.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tensor(-0.0136, grad_fn=&lt;MeanBackward0&gt;) tensor(-0.0136, grad_fn=&lt;MeanBackward0&gt;) </code></pre></div></div> <p>If we remove the <code class="language-plaintext highlighter-rouge">.detach()</code> part, we can see that <code class="language-plaintext highlighter-rouge">layer_1</code> has been updated.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tensor(-0.0136, grad_fn=&lt;MeanBackward0&gt;) tensor(0.0113, grad_fn=&lt;MeanBackward0&gt;) </code></pre></div></div> <p>Of course, there is also an <a href="https://pytorch.org/docs/stable/nn.functional.html#gumbel-softmax">advanced way</a> of using <code class="language-plaintext highlighter-rouge">.detach()</code> on purpose.</p> <h2 id="related-materials">Related materials</h2> <ul> <li><a href="https://towardsdatascience.com/7-tips-for-squeezing-maximum-performance-from-pytorch-ca4a40951259">7 Tips To Maximize PyTorch Performance</a></li> <li><a href="https://towardsdatascience.com/efficient-pytorch-part-1-fe40ed5db76c">Efficient PyTorch — Eliminating 
Bottlenecks</a></li> </ul>Dongkwan Kimdongkwan.kim@kaist.ac.krWhile grading students’ codes this semester (Fall 2021), I found some suboptimal patterns that students often use. This article organizes them and introduces a more efficient use of PyTorch.Use a Docker Container like a Remote Server2021-04-06T00:00:00+09:002021-04-06T00:00:00+09:00https://dongkwan-kim.github.io/blogs/use-a-docker-container-like-a-remote-server<p>This post describes how to use a Docker container like a single isolated remote server. Some of the content focuses on using NVIDIA GPUs, so you can omit the details if you are not interested in this.</p> <h2 id="preliminary">Preliminary</h2> <ul> <li>In this article, your server will be a remote machine (or host machine) and your laptop will be a local machine.</li> <li>You have to install the Docker first in your remote machine. (Or ask the root to do this).</li> <li>You do not have to be a root of the host machine to follow this post.</li> </ul> <h2 id="instruction">Instruction</h2> <p>Pull an image you want to use in your remote machine. 
In this post, it is <code class="language-plaintext highlighter-rouge">nvidia/cuda:10.2-cudnn8-devel-ubuntu18.04</code>.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In the remote machine,</span> docker pull nvidia/cuda:10.2-cudnn8-devel-ubuntu18.04 </code></pre></div></div> <p>Create a container based on this image.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In the remote machine,</span> docker run <span class="nt">-ti</span> <span class="se">\</span> <span class="nt">--runtime</span><span class="o">=</span>nvidia <span class="se">\</span> <span class="nt">--name</span> <span class="nv">$container_name</span> <span class="se">\</span> <span class="nt">-p</span> <span class="nv">$ssh_port</span>:22 <span class="nt">-p</span> <span class="nv">$tensorboard_or_jupyter_port</span>:6006 <span class="se">\</span> <span class="nt">-v</span> <span class="nv">$mount_dir</span>:<span class="nv">$mount_dir</span> <span class="se">\</span> <span class="nt">--ipc</span><span class="o">=</span>host <span class="se">\</span> <span class="nt">-d</span> nvidia/cuda:10.2-cudnn8-devel-ubuntu18.04 /bin/bash </code></pre></div></div> <ul> <li>If you want to use GPUs in the container, you have to specify <code class="language-plaintext highlighter-rouge">--runtime=nvidia</code>.</li> <li>You have to forward port 22 (ssh) to a port of your own.</li> <li>If you want to use Tensorboard or Jupyter, you have to forward additional ports for these.</li> <li>If you want to use file systems in the host machine, you have to bind mount a volume (<code class="language-plaintext highlighter-rouge">-v $mount_dir:$mount_dir</code>) using absolute paths.</li> <li>If you want to use multiple subprocesses (<code class="language-plaintext highlighter-rouge">num_workers</code> &gt; 1) in <code class="language-plaintext highlighter-rouge">DataLoader</code>, it is 
recommended to put <code class="language-plaintext highlighter-rouge">--ipc=host</code>.</li> <li>If you want to use a specific set of GPUs, use the <code class="language-plaintext highlighter-rouge">--gpus</code> argument (e.g., <code class="language-plaintext highlighter-rouge">--gpus '"device=0,1,2,3"'</code>).</li> </ul> <p>Now attach to your container.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In the remote machine,</span> docker attach <span class="nv">$container_name</span> </code></pre></div></div> <p>In the container, install whatever you need.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In the container of the remote machine,</span> apt-get update apt-get <span class="nb">install </span>vim git openssh-server ssh <span class="nb">sudo </span>curl screen software-properties-common add-apt-repository ppa:deadsnakes/ppa apt-get update apt-get <span class="nb">install </span>python3.7 python3-pip python3.7-dev </code></pre></div></div> <p>If you want to log in to the container with a password, turn off <code class="language-plaintext highlighter-rouge">PubkeyAuthentication</code>.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In the container of the remote machine,</span> <span class="nb">sed</span> <span class="s1">'s/PubkeyAuthentication/#PubkeyAuthentication/g'</span> /etc/ssh/sshd_config <span class="o">&gt;</span> ~/sshd_config.tmp <span class="nb">cat</span> ~/sshd_config.tmp <span class="o">&gt;</span> /etc/ssh/sshd_config <span class="nb">rm</span> ~/sshd_config.tmp service ssh restart </code></pre></div></div> <p>Create a user account and add it to the sudo group.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In the container of the remote machine,</span> adduser <span 
class="nv">$user_name</span> usermod <span class="nt">-aG</span> <span class="nb">sudo</span> <span class="nv">$user_name</span> su <span class="nv">$user_name</span> </code></pre></div></div> <p>Configure user-specific settings: git global configuration, shell, bashrc/zshrc, or vimrc.</p> <p>Close the shell you are working in, and open a new shell on the local machine. Then access the container you created with the command below.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In the local machine,</span> ssh <span class="nt">-p</span> <span class="nv">$ssh_port</span> <span class="nv">$user_name</span>@<span class="nv">$host</span> </code></pre></div></div> <p>If you want to use Tensorboard or Jupyter, please follow <a href="/blogs/tensorboard-in-a-docker-container/">my other post</a>.</p>Dongkwan Kimdongkwan.kim@kaist.ac.krThis post describes how to use a Docker container like a single isolated remote server. Some of the content focuses on using NVIDIA GPUs, so you can omit the details if you are not interested in this.A Short History of Positional Encoding2021-02-09T00:00:00+09:002021-02-09T00:00:00+09:00https://dongkwan-kim.github.io/blogs/a-short-history-of-positional-encoding<p>Since I first saw the ‘Attention Is All You Need’ paper, I have had a strong curiosity about the principle and theory of positional encoding. It is well understood that the Transformer lacks the inductive bias of recurrent architectures and thus introduces positional encoding. However, I am still not convinced of how and why this works. The authors mentioned that they chose this design because of the sinusoid’s special properties with respect to relative position, but that was not enough for me.</p> <blockquote> <p>… we hypothesized it would allow the model to easily learn to attend by relative positions, since for any fixed offset $k$, $PE_{pos+k}$ can be represented as a linear function of $PE_{pos}$. 
(Section 3.5 in <a href="https://arxiv.org/abs/1706.03762">Vaswani et al., 2017</a>)</p> </blockquote> <p>While searching the related literature, I read papers that develop more advanced positional encodings. In particular, I found that positional encoding in the Transformer can be beautifully extended to represent time (generalization to continuous space) and positions in a graph (generalization to irregular structure). In this post, I review the work related to positional encoding and describe the theories behind its generalization to time and graphs.</p> <h2 id="positional-representation">Positional Representation</h2> <h3 id="learned-positional-embedding">Learned Positional Embedding</h3> <p>Before Transformers, <a href="https://arxiv.org/abs/1705.03122">Gehring et al., 2017 (ConvS2S)</a> replaced recurrent neural networks with convolutional neural networks for sequence-to-sequence learning. Convolution might be less effective than attention-only modules, but unlike recurrent units, it can exploit the parallelism of GPU hardware. Since the convolution operator sees only part of the sequence, it can learn word order only within the kernel size, not over the whole context. That is why ConvS2S uses an additional embedding to let the model know the input’s position.</p> <p>Its implementation is straightforward. 
Positional embedding in ConvS2S is just a learnable parameter with the same dimension as the word embedding.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># https://github.com/pytorch/fairseq/blob/master/fairseq/modules/positional_embedding.py#L25-L26 </span><span class="n">m</span> <span class="o">=</span> <span class="n">LearnedPositionalEmbedding</span><span class="p">(</span><span class="n">num_embeddings</span><span class="p">,</span> <span class="n">embedding_dim</span><span class="p">,</span> <span class="n">padding_idx</span><span class="p">)</span> <span class="n">nn</span><span class="p">.</span><span class="n">init</span><span class="p">.</span><span class="n">normal_</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">weight</span><span class="p">,</span> <span class="n">mean</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">std</span><span class="o">=</span><span class="n">embedding_dim</span> <span class="o">**</span> <span class="o">-</span><span class="mf">0.5</span><span class="p">)</span> </code></pre></div></div> <p><a href="https://arxiv.org/abs/1706.03762">Vaswani et al., 2017 (Transformer)</a> compares ConvS2S’ learned positional embedding with their sinusoidal encoding, and the performances are almost the same. It also argues that the “sinusoidal version may allow the model to extrapolate to sequence lengths longer than the ones encountered during training”.</p> <h3 id="positional-encoding-with-sinusoids">Positional Encoding with Sinusoids</h3> <p>As we can see from the title, <a href="https://arxiv.org/abs/1706.03762">Attention Is All You Need</a>, Transformers fully replace the recurrent units with attention. 
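<p>To make the sinusoidal encoding concrete before its formal definition, here is a short sketch in PyTorch. This is my illustration (the function name and shapes are my choice), not code from the paper:</p>

```python
import torch

def sinusoidal_pe(max_len: int, d_model: int) -> torch.Tensor:
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)),
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))."""
    pos = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)  # [max_len, 1]
    two_i = torch.arange(0, d_model, 2, dtype=torch.float32)       # [d_model / 2]
    angle = pos / torch.pow(10000.0, two_i / d_model)              # [max_len, d_model / 2]
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(angle)  # even dimensions get sine
    pe[:, 1::2] = torch.cos(angle)  # odd dimensions get cosine
    return pe

pe = sinusoidal_pe(max_len=50, d_model=128)
print(pe.shape)  # torch.Size([50, 128])
```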
Unlike the recurrent unit, the attention computation across tokens can be fully parallelized; that is, the model does not have to wait for the calculation of the previous token’s representation to get the current token’s representation. However, in exchange for this parallelism, Transformers give up the inductive bias of recurrence that RNNs have. Without positional encoding, the Transformer is permutation-invariant as an operation on sets. For example, “Alice follows Bob” and “Bob follows Alice” are completely different sentences, but a Transformer without position information will produce the same representation. Therefore, the Transformer explicitly encodes the position information.</p> <p>Their proposed sinusoidal positional encoding is probably the most famous variant of positional encoding in transformer-like models. It is composed of sine and cosine values that take the position index as input.</p> \begin{aligned} PE_{(\text{pos},\ 2i)} &amp;=\sin \left(\operatorname{pos} / 10000^{2 i / d_{\text {model }}}\right) \\ PE_{(\text{pos},\ 2i+1)} &amp;=\cos \left(\operatorname{pos} / 10000^{2 i / d_{\text {model }}}\right) \end{aligned} <p>If we draw this equation, it looks like Figure 1.</p> <figure> <img src="/images/sinusoidal-pe.png" alt="Visualization of sinusoidal positional encoding." /> <figcaption> <b>Figure 1.</b> Visualization of sinusoidal positional encoding. 
</figcaption> <span class="figsource"> Source: <a href="https://www.tensorflow.org/tutorials/text/transformer?hl=en" target="_blank">TensorFlow tutorial: Transformer model for language understanding</a> </span> </figure> <h3 id="relative-positional-encoding">Relative Positional Encoding</h3> <p><a href="https://arxiv.org/abs/1803.02155">Shaw et al., 2018</a>, <a href="https://arxiv.org/abs/1901.02860">Dai et al., 2019 (Transformer-XL)</a></p> <h3 id="learning-to-encode-position-for-transformer-with-continuous-dynamical-model">Learning to Encode Position for Transformer with Continuous Dynamical Model</h3> <p><a href="http://proceedings.mlr.press/v119/liu20n.html">Liu et al., 2020 (FLOATER)</a></p> <h3 id="rethinking-positional-encoding">Rethinking Positional Encoding</h3> <p><a href="https://openreview.net/forum?id=09-528y2Fgf">Ke et al., 2021</a></p> <h2 id="generalization-beyond-position">Generalization Beyond Position</h2> <h3 id="time-representation">Time Representation</h3> <p><a href="https://arxiv.org/abs/1911.12864">Xu et al., 2019 (Bochner/Mercer Time Embedding)</a>, <a href="https://arxiv.org/abs/1907.05321">Kazemi et al., 2019 (time2vec)</a>, <a href="https://papers.nips.cc/paper/2019/hash/952285b9b7e7a1be5aa7849f32ffff05-Abstract.html">Voelker et al., 2019 (Legendre Memory Units)</a> <a href="https://openreview.net/forum?id=whE31dn74cL">Xu et al., 2021 (Temporal Kernel)</a>, <a href="https://openreview.net/forum?id=mXbhcalKnYM">Shukla and Marlin, 2021 (Multi-time attention)</a></p> <h3 id="tree-positional-encoding">Tree Positional Encoding</h3> <p><a href="https://papers.nips.cc/paper/2019/hash/6e0917469214d8fbd8c517dcdc6b8dcf-Abstract.html">Shiv and Quirk, 2019</a></p> <h3 id="graph-positional-encodings-with-laplacian-eigenvectors">Graph Positional Encodings with Laplacian Eigenvectors</h3> <p><a href="https://arxiv.org/abs/2006.09963">Qiu et al., 2020 (GCC)</a>, <a href="https://arxiv.org/abs/2003.00982v3">Dwivedi et al., 2020 (Benchmarking 
GNNs)</a></p>Dongkwan Kimdongkwan.kim@kaist.ac.krSince I first saw the ‘Attention Is All You Need’ paper, I had a strong curiosity about the principle and theory of positional encoding. It is well understood that the Transformer did not have inductive biases for RNN architectures and thus introduced positional encoding. However, I have still not convinced how and why this works. The authors mentioned that they chose this design because of the special nature of sinusoid about the relative position, but it was not enough for me. … we hypothesized it would allow the model to easily learn to attend by relative positions, since for any fixed offset $k$, $PE_{pos+k}$ can be represented as a linear function of $PE_{pos}$. (Section 3.5 in Vaswani et al., 2017)Blogs and Newsletters about Machine Learning2021-02-04T00:00:00+09:002021-02-04T00:00:00+09:00https://dongkwan-kim.github.io/blogs/blogs-and-newsletter-about-machine-learning<p>Here is a list of blogs and newsletters about machine learning that I follow (or regularly read).</p> <h2 id="newsletters">Newsletters</h2> <ul> <li><a href="https://www.deeplearningweekly.com/">Deep Learning Weekly</a></li> <li><a href="https://ruder.io/nlp-news/">NLP News (Sebastian Ruder)</a></li> <li><a href="https://paperswithcode.com/newsletter/">PwC Newsletter</a></li> <li><a href="https://huggingface.curated.co/">Huggingface Issue</a></li> <li><a href="https://graphml.substack.com/">GML Newsletter</a></li> </ul> <h2 id="blogs-researchers">Blogs (Researchers)</h2> <ul> <li><a href="https://ruder.io/">Sebastian Ruder</a></li> <li><a href="https://lilianweng.github.io/lil-log/">Lil’Log</a></li> <li><a href="https://jalammar.github.io/">Jay’s ML Blog</a></li> <li><a href="https://amitness.com/">Amit Chaudhary</a></li> <li><a href="http://evjang.com/">Eric Jang</a></li> <li><a href="http://gregorygundersen.com/blog/">Gregory Gundersen</a></li> </ul> <h2 id="blogs-organization">Blogs (Organization)</h2> <ul> <li><a 
href="https://ai.googleblog.com/">Google AI</a></li> <li><a href="https://www.deepmind.com/blog">DeepMind</a></li> <li><a href="https://www.microsoft.com/en-us/research/blog/">Microsoft Research</a></li> <li><a href="https://bair.berkeley.edu/blog/">BAIR Blog</a></li> </ul>Dongkwan Kimdongkwan.kim@kaist.ac.krHere is a list of blogs and newsletters about machine learning that I follow (or regularly read).Handling Struct Error when Using Python Multiprocessing Pool2020-09-12T00:00:00+09:002020-09-12T00:00:00+09:00https://dongkwan-kim.github.io/blogs/struct-error-python-multiprocessing<p>When using Python <code class="language-plaintext highlighter-rouge">multiprocessing.Pool</code> to process large data, I encountered this strange error.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>struct.error: 'i' format requires -2147483648 &lt;= number &lt;= 2147483647 </code></pre></div></div> <p>Looking at the internal structure, it seems that arguments are pickled before they are sent to child processes from the parent process. 
When the size of arguments is too large for pickling (maybe more than 2147483647), this kind of struct error occurs.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">multiprocessing</span> <span class="kn">import</span> <span class="n">Pool</span> <span class="k">def</span> <span class="nf">f</span><span class="p">(</span><span class="n">data_id</span><span class="p">,</span> <span class="n">large_data</span><span class="p">):</span> <span class="k">pass</span> <span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span> <span class="n">big_data</span> <span class="o">=</span> <span class="n">Data</span><span class="p">()</span> <span class="n">pool</span> <span class="o">=</span> <span class="n">Pool</span><span class="p">()</span> <span class="n">pool</span><span class="p">.</span><span class="n">starmap</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="p">[(</span><span class="n">i</span><span class="p">,</span> <span class="n">big_data</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">100</span><span class="p">)])</span> </code></pre></div></div> <p>I don’t know whether there is an elegant solution, but it can be solved by declaring this data globally.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">multiprocessing</span> <span class="kn">import</span> <span class="n">Pool</span> <span class="k">def</span> <span class="nf">f</span><span class="p">(</span><span class="n">data_id</span><span class="p">):</span> <span class="k">global</span> <span class="n">big_data</span> <span class="n">large_data</span> <span 
class="o">=</span> <span class="n">big_data</span> <span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span> <span class="n">big_data</span> <span class="o">=</span> <span class="n">Data</span><span class="p">()</span> <span class="n">pool</span> <span class="o">=</span> <span class="n">Pool</span><span class="p">()</span> <span class="n">pool</span><span class="p">.</span><span class="n">starmap</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="p">[(</span><span class="n">i</span><span class="p">,)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">100</span><span class="p">)])</span> </code></pre></div></div>Dongkwan Kimdongkwan.kim@kaist.ac.krWhen using Python multiprocessing.Pool to process large data, I met this strange error. struct.error: 'i' format requires -2147483648 &lt;= number &lt;= 2147483647Run Tensorboard (or Jupyter) in a Remote Docker Container2020-06-30T00:00:00+09:002020-06-30T00:00:00+09:00https://dongkwan-kim.github.io/blogs/tensorboard-in-a-docker-container<p>When creating a container, forward two ports 22 (for ssh) and 6006 (for Tensorboard).</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="nt">-ti</span> <span class="nt">--runtime</span><span class="o">=</span>nvidia <span class="nt">-p</span> 8082:22 <span class="nt">-p</span> 8083:6006 nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04 /bin/bash </code></pre></div></div> <p>We can access a container with ssh by,</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh &lt;user&gt;@&lt;host&gt; <span class="nt">-p</span> 8082 </code></pre></div></div> <h2 id="access-through-localhost">Access through localhost</h2> <p>In a remote container, run 
Tensorboard on port 6006 (the default).</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tensorboard <span class="nt">--logdir</span> lightning_logs </code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TensorFlow installation not found - running with reduced feature set. Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all TensorBoard 2.2.2 at http://localhost:6006/ (Press CTRL+C to quit) </code></pre></div></div> <p>On the local machine, forward local port 8083 to the remote’s port 6006.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh -L 8083:127.0.0.1:6006 &lt;user&gt;@&lt;host&gt; -p 8082 </code></pre></div></div> <p>Now, we can access the Tensorboard web interface at the address <code class="language-plaintext highlighter-rouge">localhost:8083</code> on the local machine.</p> <p>You can use Jupyter in the same way. Just change 6006 to 8888 (the default port in Jupyter).</p> <h2 id="access-through-ip-address-or-domain-name">Access through IP address or domain name</h2> <p>If you want to access the Tensorboard through the IP address or domain name of the server, add <code class="language-plaintext highlighter-rouge">--host 0.0.0.0</code> to the tensorboard command.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tensorboard <span class="nt">--logdir</span> lightning_logs <span class="nt">--host</span> 0.0.0.0 </code></pre></div></div> <p>You can access the Tensorboard page at <code class="language-plaintext highlighter-rouge">http://&lt;host&gt;:8083</code> from any machine connected to the internet.</p>Dongkwan Kimdongkwan.kim@kaist.ac.krWhen creating a container, forward two ports: 22 (for ssh) and 6006 (for Tensorboard). 
docker run -ti --runtime=nvidia -p 8082:22 -p 8083:6006 nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04 /bin/bashAll Possible Combinations of Row-wise Addition Using PyTorch2020-04-21T00:00:00+09:002020-04-21T00:00:00+09:00https://dongkwan-kim.github.io/blogs/all-possible-combinations-of-row-wise-addition-using-pytorch<p>How to get all possible combinations of row-wise addition (or subtraction) using PyTorch? For example, if we have <code class="language-plaintext highlighter-rouge">x = [x0, x1, x2]</code> and <code class="language-plaintext highlighter-rouge">y = [y0, y1]</code>, our goal is to compute <code class="language-plaintext highlighter-rouge">[x0 + y0, x0 + y1, x1 + y0, x1 + y1, x2 + y0, x2 + y1]</code>.</p> <p>Using PyTorch, the answer is:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">Tensor</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">])</span> <span class="n">y</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">Tensor</span><span class="p">([</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">])</span> <span class="s">""" tensor([10., 20., 11., 21., 12., 22.]) """</span> <span class="n">ans</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="p">.</span><span class="n">unsqueeze</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">y</span><span class="p">.</span><span class="n">unsqueeze</span><span class="p">(</span><span class="mi">0</span><span class="p">)).</span><span class="n">view</span><span class="p">(</span><span class="o">-</span><span
class="mi">1</span><span class="p">)</span> </code></pre></div></div> <p>Similarly, for multi-dimensional vectors,</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">Tensor</span><span class="p">([[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">],</span> <span class="p">[</span><span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">]])</span> <span class="n">y</span> <span class="o">=</span> <span class="n">torch</span><span class="p">.</span><span class="n">Tensor</span><span class="p">([[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">30</span><span class="p">],</span> <span class="p">[</span><span class="mi">40</span><span class="p">,</span> <span class="mi">50</span><span class="p">,</span> <span class="mi">60</span><span class="p">]])</span> <span class="s">""" tensor([[10., 21., 32.], [40., 51., 62.], [13., 24., 35.], [43., 54., 65.], [16., 27., 38.], [46., 57., 68.]]) """</span> <span class="n">ans</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="p">.</span><span class="n">unsqueeze</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">y</span><span class="p">.</span><span class="n">unsqueeze</span><span class="p">(</span><span class="mi">0</span><span class="p">)).</span><span class="n">view</span><span class="p">(</span><span
class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> </code></pre></div></div>Dongkwan Kimdongkwan.kim@kaist.ac.krHow to get all possible combinations of row-wise addition (or subtraction) using PyTorch? For example, if we have x = [x0, x1, x2] and y = [y0, y1], our goal is to compute [x0 + y0, x0 + y1, x1 + y0, x1 + y1, x2 + y0, x2 + y1].Analytic Expressions of Indices for the Upper-triangle Matrix2020-03-07T00:00:00+09:002020-03-07T00:00:00+09:00https://dongkwan-kim.github.io/blogs/indices-for-the-upper-triangle-matrix<p>Recently, I implemented the negative edge sampling of undirected graphs for <a href="https://pytorch-geometric.readthedocs.io/">PyTorch Geometric</a>. Since $(i, j)$ and $(j, i)$ are the same edge in the undirected graph, I sample entries from the upper triangle of the given adjacency matrix. An easy solution is using <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.triu_indices.html"><code class="language-plaintext highlighter-rouge">numpy.triu_indices</code></a>, but this suffers from $O(N^2)$ memory complexity (for $G = (V, E)$ and $N = |V|$) from the possible edge space $V \times V$.</p> <p>For the negative sampling of directed graphs, the existing code of PyTorch Geometric first converts the coordinates $(i, j)$ to linear (or consecutive) indices $N \cdot i + j$ and samples indices from $[ 0, N^2 - 1 ]$ (<code class="language-plaintext highlighter-rouge">range(N ** 2)</code>). Note that using Python’s standard <code class="language-plaintext highlighter-rouge">random.sample</code> with <code class="language-plaintext highlighter-rouge">range</code> is really memory-efficient, as the <a href="https://docs.python.org/3/library/random.html#random.sample">docs</a> say.</p> <p>In order to apply this to the undirected version of negative sampling, we need to convert the upper triangle matrix into linear indices. 
For example, the linear indices of the $N = 4$ graph’s upper triangle matrix will be:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0 1 2 3 - 4 5 6 - - 7 8 - - - 9 </code></pre></div></div> <p>Encoding the coordinate $(i, j)$ as the linear index $X$ is straightforward. The cumulative number of neglected entries for the $i$th row is $\sum_{x=0}^{i} x = \frac{i (i + 1)}{2}$, so subtracting this value from the indices of the original matrix gives the solution.</p> <p>$X = Ni + j - \frac{i (i + 1)}{2}$.</p> <p>What about the inverse? Can we infer $i$ and $j$ from $X$? The answer is yes. There are analytic expressions of $i$ and $j$ in terms of $X$, which are actually well-known results on StackOverflow [<a href="https://stackoverflow.com/a/53234021">1</a>, <a href="https://stackoverflow.com/a/244550">2</a>].</p> <p>$i = N - 1 - \left[\frac{-1 + \sqrt{(2N+1)^2 - 8 (X + 1)}}{2} \right]$</p> <p>$j = X - Ni + \frac{i (i + 1)}{2}$</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">i</span> <span class="o">=</span> <span class="n">N</span> <span class="o">-</span> <span class="mi">1</span> <span class="o">-</span> <span class="n">np</span><span class="p">.</span><span class="n">floor</span><span class="p">((</span><span class="o">-</span><span class="mi">1</span> <span class="o">+</span> <span class="n">np</span><span class="p">.</span><span class="n">sqrt</span><span class="p">((</span><span class="mi">2</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">-</span> <span class="mi">8</span> <span class="o">*</span> <span class="p">(</span><span class="n">X</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)))</span> <span class="o">/</span> <span class="mi">2</span><span
class="p">)</span> <span class="n">j</span> <span class="o">=</span> <span class="n">X</span> <span class="o">-</span> <span class="n">i</span> <span class="o">*</span> <span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="n">N</span> <span class="o">-</span> <span class="n">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">//</span> <span class="mi">2</span> </code></pre></div></div> <p>My sincere friend, Chaehwan Song, explained to me how the above equations are derived. With her permission, I describe her solution in this post.</p> <p><img src="/images/upper-matrix.png" alt="upper-matrix" /></p> <p>The size of the shaded area is $\sum_{z=1}^{N} z - (X + 1) = \frac{N(N+1)}{2} - (X + 1)$, and from the right figure, we know that the following inequality holds.</p> <p>$\sum_{z=1}^{N - 1 - i} z \leq \frac{N(N+1)}{2} - (X + 1) &lt; \sum_{z=1}^{N - i} z$</p> <p>$\Rightarrow \frac{(N-1-i)(N-i)}{2} \leq \frac{N(N+1)}{2} - (X + 1) &lt; \frac{(N-i)(N-i+1)}{2}$ (①)</p> <p>Let $i^c = N - 1 - i$ (for simplicity), $f(x) = \frac{x(x+1)}{2}$, and $k \in [0, N]$ s.t. 
$f(k) = \frac{N(N+1)}{2} - (X + 1)$.</p> <p>Then, inequality ① becomes $f(i^c) \leq f(k) &lt; f(i^c + 1)$.</p> <p>Since $f(x)$ is monotonically increasing on $[0, N]$, we obtain $i^c \leq k &lt; i^c + 1$, which directly implies that $[ k ] = i^c$ holds (②).</p> <p>From the definition of $k$,</p> <p>$f(k) = \frac{k(k+1)}{2} = \frac{N(N+1)}{2} - (X + 1)$</p> <p>$\Rightarrow k = \frac{-1 \pm \sqrt{(2N+1)^2 - 8 (X + 1)}}{2}$,</p> <p>and the proper $k \in [0, N]$ is $k = \frac{-1 + \sqrt{(2N+1)^2 - 8 (X + 1)}}{2}$.</p> <p>From ②, we have $i^c = \left[ \frac{-1 + \sqrt{(2N+1)^2 - 8 (X + 1)}}{2} \right]$.</p> <p>$\therefore i = N - 1 - \left[\frac{-1 + \sqrt{(2N+1)^2 - 8 (X + 1)}}{2} \right]$,</p> <p>and $j$ follows in a straightforward way.</p>Dongkwan Kimdongkwan.kim@kaist.ac.krRecently, I implemented the negative edge sampling of undirected graphs for PyTorch Geometric. Since $(i, j)$ and $(j, i)$ are the same edge in the undirected graph, I sample entries from the upper triangle of the given adjacency matrix. An easy solution is using numpy.triu_indices, but this suffers from $O(N^2)$ memory complexity (for $G = (V, E)$ and $N = |V|$) from the possible edge space $V \times V$.Install PyTorch with CUDA 10.02020-02-20T00:00:00+09:002020-02-20T00:00:00+09:00https://dongkwan-kim.github.io/blogs/install-pytorch-with-cuda-10<p>To fix a PyTorch CUDA version error like the one below:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RuntimeError: Detected that PyTorch and torch_sparse were compiled with different CUDA versions. PyTorch has CUDA version 10.1 and torch_sparse has CUDA version 10.0. Please reinstall the torch_sparse that matches your PyTorch install. 
</code></pre></div></div> <p>Install PyTorch with the right CUDA tag.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pip <span class="nb">install </span><span class="nv">torch</span><span class="o">==</span>1.4.0+cu100 <span class="nt">-f</span> https://download.pytorch.org/whl/torch_stable.html </code></pre></div></div> <p>You can write cu92 or cu101 instead of cu100.</p> <p><a href="https://github.com/rusty1s/pytorch_geometric#installation">PyTorch Geometric</a> is now using a similar form of pip indexing.</p>Dongkwan Kimdongkwan.kim@kaist.ac.krTo fix a PyTorch CUDA version error like the one below: RuntimeError: Detected that PyTorch and torch_sparse were compiled with different CUDA versions. PyTorch has CUDA version 10.1 and torch_sparse has CUDA version 10.0. Please reinstall the torch_sparse that matches your PyTorch install.Chrome Extensions for Computer Science Graduate Students2019-11-24T00:00:00+09:002019-11-24T00:00:00+09:00https://dongkwan-kim.github.io/blogs/chrome-extensions-for-computer-science-grads<p>A list of Chrome Extensions that might be useful to computer science graduate students.</p> <h2 id="general">General</h2> <h3 id="dark-reader"><a href="https://chrome.google.com/webstore/detail/dark-reader/eimadpbcbfnmbkopoojfekhnkhdbieeh">Dark Reader</a></h3> <p>Inverts the brightness of web pages and aims to reduce eyestrain while browsing the web.</p> <h3 id="rooster-for-chrome"><a href="https://chrome.google.com/webstore/detail/rooster-for-chrome/pimolnhbniceppehbgmibnbgcnhpkhfh">Rooster for Chrome™</a></h3> <p>Helps you stay productive by giving you insights into your browsing habits and sending notifications.</p> <h3 id="vertical-tabs"><a href="https://chrome.google.com/webstore/detail/vertical-tabs/pddljdmihkpdfpkgmbhdomeeifpklgnm">Vertical Tabs</a></h3> <p>Displays tabs vertically in a sidebar. Supports drag-and-drop, search, and filtering of tabs. 
(Sometimes this plugin creates a weird layout.)</p> <h2 id="codes">Codes</h2> <h3 id="octotree"><a href="https://chrome.google.com/webstore/detail/octotree/bkhaagjahfmjljalopjnoealnfndnagc">Octotree</a></h3> <p>Creates an easy-to-navigate code tree on GitHub for file browsing and downloads.</p> <h3 id="refined-github"><a href="https://chrome.google.com/webstore/detail/refined-github/hlepfoohegkhhmjieoechaddaejaokhf">Refined GitHub</a></h3> <p>Simplifies the GitHub interface and adds useful features.</p> <h3 id="code-finder-for-research-papers---catalyzex"><a href="https://chrome.google.com/webstore/detail/code-finder-for-research/aikkeehnlfpamidigaffhfmgbkdeheil">Code Finder for Research Papers - CatalyzeX</a></h3> <p>Finds and shows links to code implementations for research papers directly on Google, arXiv, Scholar, Twitter, GitHub, and more.</p> <h2 id="reading-and-writing">Reading and Writing</h2> <h3 id="copy-bibtex-in-1-click-on-gscholar"><a href="https://chrome.google.com/webstore/detail/bib-%E2%80%94-copy-bibtex-in-1-cl/onnmdchfagapkggbhnnjkmllimegclnh/related">Copy BibTex in 1 click on GScholar</a></h3> <p>Copy BibTeX from the GScholar result page with one click!</p> <h3 id="bibtex-it"><a href="https://chrome.google.com/webstore/detail/bibtex-it/hofkoiddldajhihgjbckeffpodeoockc">BibTex It!</a></h3> <p>This extension gets a BibTeX citation from your text selection with one click.</p> <h3 id="redirectify"><a href="https://chrome.google.com/webstore/detail/redirectify/mhjmbfadcbhilcfdhkkepffbnjaghfie">Redirectify</a></h3> <p>Redirects requests for PDFs of papers at some popular sites to their corresponding HTML index pages. 
(For example, <a href="https://arxiv.org/pdf/1602.07527.pdf">this pdf link</a> redirects to <a href="https://arxiv.org/abs/1602.07527">this webpage</a>)</p> <h3 id="grammarly-for-chrome"><a href="https://chrome.google.com/webstore/detail/grammarly-for-chrome/kbfnbcaeplbcioakkpcpgfkobkghlhen">Grammarly for Chrome</a></h3> <p>Use Grammarly in Chrome.</p> <h3 id="overleaf-textarea"><a href="https://chrome.google.com/webstore/detail/overleaf-textarea/iejmieihafhhmjpoblelhbpdgchbckil/">Overleaf textarea</a></h3> <p>This plugin displays your text in a textarea so you can use spellcheck plugins like Grammarly. (It may not be that effective if your paper contains a bunch of equations.)</p> <h2 id="for-koreans">For Koreans</h2> <h3 id="showasis"><a href="https://chrome.google.com/webstore/detail/showasis/lehpimiaaocmlkoebjgkokglapbadpdh">ShowAsIs</a></h3> <p>Use CJK fonts in Google Slides.</p> <h3 id="find-korea-from-dropdowns"><a href="https://chrome.google.com/webstore/detail/find-korea-from-dropdowns/lfphjcfkgaiiojhbippbghhdikoibedi">Find Korea from Dropdowns</a></h3> <p>Find Korea when you go to academic conferences.</p> <h2 id="not-a-chrome-extension-but-still-useful">Not a Chrome Extension, but Still Useful</h2> <h3 id="auto-latex-equations"><a href="https://gsuite.google.com/marketplace/app/autolatex_equations/850293439076?pann=cwsdp&amp;hl=ko">Auto-LaTeX Equations</a></h3> <p>Instantly convert every math equation in Google Docs documents into LaTeX images.</p>Dongkwan Kimdongkwan.kim@kaist.ac.krA list of Chrome Extensions that might be useful to computer science graduate students.