Treasure Hunt: An interestingness based study of the net/http Golang code, plus a proposal

Section 1: Introduction

This post is inspired by the method of search described in Why Greatness Cannot Be Planned: The Myth of the Objective. What happens when something is studied purely out of interestingness, without any major aims whatsoever? Are the results beneficial? If so, what can we do out of the insights generated?

1.1 How to read this post

I am writing this section, Section 1 at the very end of the exploratory & writing process to help you, the reader, process the information presented here.

The Section 2 of this post is to exhibit a genuine exploration effort in exactly the order of its discovery. Such an order may not be perfectly aiding in comprehension of the particular facts or in short term or long term memorization. However, I hope you make an attempt to see and appreciate how new facts, ideas and insights emerge out of no preconception during the discovery process. If you need a summary of the insights acquired from the exploration head directly over to Section 2.2

Section 3 is about meta-insights, meta-questions and such things emerging out of the exploratory efforts. These come at the end of post, and have been super valuable in raising the group's awareness around discovery mechanisms. In particular Sections 3.2 and 3.3 have a concrete proposal and call to action to improve group capabilities in epxloration.

Section 4 is a high level examination of the whole effort and results; one can consider it an executive summary of the exploratory effort.

Section 2: An illustrative treasure hunt of net/http - documented in the order of discovery

2.1 Example: Exploring the Golang “net/http” library

Why Golang?

I have been trying to make time to learn Golang for quite a while, informally porting the Java based Lox interpreter [L]  into a Go based Lox interpreter, glox. Go is important in the Hexmos team context too. So I am naturally geared towards learning about Go these days.

Why the “net/http” package?

The decision really is based on a gut feeling, a hunch, that “net/http” would light up the murky areas of concurrency to me in some way. I also hoped that maybe I’ll learn something about idiomatic Go code, coding conventions, commit message styles and whatnot. I am not particularly _looking for _anything, it’s just sort of like a treasure hunt, for me to collect interesting things in part of the repo [L]

A quick run through the commit history

The golang repo as a whole has > 50k commits in its master branch as of April 2022. The newer commit messages take a convention:

Part: message

Where Part may stand for areas of the source tree such as “doc”, “go/types”, “net/netip” and so on. The quality of messages vary -- some have detailed explanations on “why” (multiple paragraphs), whereas some commits do not even have an explanation. However, there are two useful links in each commit message:

  1. Link to the Github pull request (where one can usually find some sort of explanation and discussion)
  2. Link to the code review

The commit message + Github discussion + code review delivers context around any change.

I am using gitk [L] to display commits related to our directory of interest: src/net/http.

Some changes are tagged with the “all” Part. And this is quite often. The authors seem to be making a tremendous effort to keep the code comfortable to read. An example follows.

Attention to detail

Russ Cox fixed hanging indents in TODO comments throughout the repo in a single shot.

Author: Russ Cox <rsc@golang.org>  2022-01-31 05:52:46

Committer: Russ Cox <rsc@golang.org>  2022-04-01 23:48:05

Parent: 7d87ccc860dc31c0cd60faf00720e2f30fd37efb (all: fix various doc comment formatting nits)

Child:  690ac4071fa3e07113bf371c9e74394ab54d6749 (all: remove trailing blank doc comment lines)

Branches: master, remotes/origin/master

Follows: go1.18beta2

Precedes:

 

all: fix TODO comment hanging indents

  

For whatever reason (perhaps some tool does this), a handful of comments, including some doc comments, have TODOs formatted like:

  

       // TODO(name): Text here and

       //             more text aligned

       //             under first text.

  

   In doc comments the second line turns into a <pre> block,

   which is undesirable in this context.

  

   Rewrite those to unindent, like this instead:

  

       // TODO(name): Text here and

       // more text aligned

       // at left column.

  

   For #51082.

  

Change-Id: Ibf5145659a61ebf9496f016752a709a7656d2d4b

Reviewed-on: https://go-review.googlesource.com/c/go/+/384258

Trust: Russ Cox <rsc@golang.org>

Run-TryBot: Russ Cox <rsc@golang.org>

Reviewed-by: Ian Lance Taylor <iant@golang.org>

TryBot-Result: Gopher Robot <gobot@golang.org>

Stress test the library through specialised tooling to find hidden bugs

"Given enough eyes, all bugs are shallow"   --Linus’s Law [L]

Firstly, we can see here that Damien Neil was the committer whereas Maisem Ali seems to be the author. I believe Damien Neil is a maintainer or core contributor.

Author: Maisem Ali <maisem@tailscale.com>  2022-03-21 23:13:45

Committer: Damien Neil <dneil@google.com>  2022-03-25 21:38:06

Parent: a10a209b23f877c80d8a5f3ebda1ce4b492ac3a9 (net/http/httputil: ignore CloseNotify when a non-background context is present)

Child:  a41763539c7ad09a22720a517a28e6018ca4db0f (net/http: handle 3xx responses with no Location)

Branches: master, remotes/origin/dev.boringcrypto, remotes/origin/master

Follows: go1.18beta2

Precedes:

 

  ** ****net/http/httptest****: ****fix race in Server.Close**

  

   When run with **race detector** the test fails without the fix.

  

   Fixes #51799

  

   Change-Id: I273adb6d3a2b1e0d606b9c27ab4c6a9aa4aa8064

   GitHub-Last-Rev: a5ddd146a2a65f2e817eed5133449c79b3af2562

   GitHub-Pull-Request: golang/go#51805

   Reviewed-on: https://go-review.googlesource.com/c/go/+/393974

   Reviewed-by: Damien Neil <dneil@google.com>

   Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

   Trust: Brad Fitzpatrick <bradfitz@golang.org>

   Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>

We see that the author applied race detector [L] to find a bug in Server.Close. Race detector comes built in with Go 1.1 and can be used as follows:

$ go test -race mypkg    // test the package

$ go run -race mysrc.go  // compile and run the program

$ go build -race mycmd   // build the command

$ go install -race mypkg // install the package

How does it work? With the --race flag enabled, the runtime watches memory for unsynchronized access to shared variables in practice. Hence, when run on realistic workloads, hopefully the bug pops up.

The particular scenario is better explored through the test case from Maisem Ali:

// Issue 51799: test hijacking a connection and then closing it

// concurrently with closing the server.

func TestCloseHijackedConnection(t \*testing.T) {

   hijacked := make(chan net.Conn)

   ts := NewServer(http.HandlerFunc(func(w http.ResponseWriter, r \*http.Request) {

       defer close(hijacked)

       hj, ok := w.(http.Hijacker)

       if !ok {

           t.Fatal("failed to hijack")

       }

       c, \_, err := hj.Hijack()

       if err != nil {

           t.Fatal(err)

       }

       hijacked <- c

   }))

 

   var wg sync.WaitGroup

   wg.Add(1)

   go func() {

       defer wg.Done()

       req, err := http.NewRequest("GET", ts.URL, nil)

       if err != nil {

           t.Log(err)

       }

       // Use a client not associated with the Server.

       var c http.Client

       resp, err := c.Do(req)

       if err != nil {

           t.Log(err)

           return

       }

       resp.Body.Close()

   }()

 

   wg.Add(1)

   conn := <-hijacked

   go func(conn net.Conn) {

       defer wg.Done()

       // Close the connection and then inform the Server that

       // we closed it.

       conn.Close()

       ts.Config.ConnState(conn, http.StateClosed)

   }(conn)

 

   wg.Add(1)

   go func() {

       defer wg.Done()

       ts.Close()

   }()

   wg.Wait()

}

Useful links for exploration:

Channels are the pipes that connect concurrent goroutines. You can send values into channels from one goroutine and receive those values into another goroutine.

Example:

messages := make(chan string)

Channels are typed by the values they convey.

httptest/Server.go

The server is written specifically for testing. Server S listens on System chosen port.

Helpful comments with great attention to detail. An example:

URL      string // base URL of form http://ipaddr:port with no trailing slash

Now come two members of the “sync” package:

   // wg counts the number of outstanding HTTP requests on this server.

   // Close blocks until all requests are finished.

   wg sync.WaitGroup

 

   mu     sync.Mutex // guards closed and conns

I have not heard of WaitGroup and am less conversant with Mutex (although have used it for sample programs and have played around with them).

sync.WaitGroup

Learn about WaitGroup in Go By Example: Go by Example: WaitGroups

  • To wait for multiple goroutines to finish, use WaitGroup

  • WaitGroup is a type of counter

  • wg.Add(1) increments the counter (?)

  • Here comes a mysterious statement:

  • “Avoid re-use of the same i value in each goroutine closure”

    • Ideas

    • Closure

    • Concurrency
  • Advantage of WaitGroups:

  • With the WaitGroup pattern, the worker itself can be unaware of the concurrency primitives being used

  • Disadvantage

  • Difficult to implement error propagation. Hence for advanced uses, ErrGroup is recommended. 

Problem 1: Goroutines & Closures

  • Guessing game:

  • What’s the output of the following program?

func main() {

   done := make(chan bool)

 

   values := \[]string{"a", "b", "c"}

   for \_, v := range values {

       go func() {

           fmt.Println(v)

           done <- true

       }()

   }

 

   // wait for all goroutines to complete before exiting

   for \_ = range values {

       <-done

   }

}

“A”, “b”, “c”? Nope. that’s not how it works.

The output is “probably” “c”, “c’, “c”. Why? In each iteration, we use the same “instance” of the variable v. Each “closure” shares the same variable. And println prints the “present value” (which is ‘c’).

How to detect the above issue?

Go vet

Running “go vet” on the last code example spits out the following warning:

~/bin/treasure_hunt ⑂master\* $ go vet goroutine_closure.go 

./goroutine_closure.go:15:25: loop variable v captured by func literal

Solution 1: Capture variables within goroutine through goroutine arguments

   for _, v := range values {

       go func(u string) {

           fmt.Println(u)

           done &lt;- true

       }(v)

   }

Solution 2: Just redefine the variable within the closure

   for _, v := range values {

       v := v // create a new 'v'.

       go func() {

           fmt.Println(v)

           done <- true

       }()

   }

Go Developers admit that the concurrency/closure problem is a mistake on their part (and may fix it in the future)

This behavior of the language, not defining a new variable for each iteration, may have been a mistake in retrospect. It may be addressed in a later version but, for compatibility, cannot change in Go version 1.

-- Go FAQ

Understanding the key bug due to incorrect use of WaitGroup

[Full Size Image]

  • A giant function is attached to the server.Config object
  • The function takes in  a connection and a connection state
  • The function throughout is mutex locked (only released at the expiry of the function scope). The mutex used is in the “mu” property of the Server struct.
  • At a broad level, what’s the function trying to do?
  • It is inspecting the connection state object, and taking actions
  • 5 states possible
    • StateNew
    • StateActive
    • StateIdle
    • StateHijacked
    • StateClosed
  • The code changes are to two types of handling
    • StateNew
    • StateHijacked
  • “Wrap” is for installing “hooks” for various lifecycle methods of a connection
  • Hooks can be “replaced”
  • When the old hook is being replaced with a new one, we ensure to run the hook once
  • This is getting difficult to follow, so I’m going to get additional context by reading the test case

The test case:

[Full sized image]

  • Create a channel “hijacked” of the net.Conn type
  • Create a new server
  • The handler function uses the “hijacked” channel to send the “close” signal at the end
  • Looks like I do not understand what “w.(http.Hijacker)” means. Weird syntax.
  • Anyway, we get an hj (highjacker) object
  • The response from the highjacker object is channeled into “hijacked”
  • We create a waitgroup, increment by 1
  • Launch a goroutine “under the weightgroup” that does
  • GET request
  • Use a client not associated with server (I think I saw something related to this before, essentially the server only gives some sort of client handler and we’re supposed to use that?)
  • The client does the GET request
  • Close connection
  • Meanwhile, we increment wg again
  • We get the “conn” object from the “hijacked” channel
  • Then we launch another goroutine under wg

    • We close the connection
    • And set call the procedure for http.closed (hook)
    • This will call “forgetConnection” or something [?]
  • And the third piece is, we launch another goroutine under wg

  • We simply close the server (not connection)

  • Open question

  • There is a symbol “c” within the “wrap” function. Still couldn’t figure out where it’s coming from. 
  • It is necessary to understand the new “hijack handling” code
  • Alright, I missed it, “c” is sent as part of setting/replacing the hooks
  • c is of type net.Conn
  • Tricky syntax for if:
    • if _, exists := s.conns[c]; exists {
  • Get data, use “;” to check value existence in the same line
  • If it’s a new state the connection claims to be in, then that cannot exist in s.conns

Golang uses mutex widely. Another commit used to achieve mutual exclusion. Seems worthwhile to see some examples of mutex usage. 

Mutex Example

We use the mutex “c.mu” to Lock before an atomic operation and set a defer for the unlock operation at the end of the atomic operation. 

package main

import (
   "fmt"
   "sync"
)

 

type Container struct {
   mu       sync.Mutex
   counters map\[string]int
}

 

func (c \*Container) inc(name string) {
   c.mu.Lock()
   defer c.mu.Unlock()
   c.counters\[name]++
}

 

func main() {
   c := Container{
       counters: map\[string]int{"a": 0, "b": 0},
   }

   var wg sync.WaitGroup
   doIncrement := func(name string, n int) {
       for i := 0; i &lt; n; i++ {
           c.inc(name)
       }
       wg.Done()
   }

   wg.Add(3)
   go doIncrement("a", 10000)
   go doIncrement("a", 10000)
   go doIncrement("b", 10000)

   wg.Wait()

   fmt.Println(c.counters)

}

Another Race Condition handling in from net/http. Link to the commit (same as the one specified a bit earlier).

The description reads:

Author: Daniel Morsing &lt;daniel.morsing@gmail.com>  2015-04-21 03:32:07

Committer: Daniel Morsing &lt;daniel.morsing@gmail.com>  2015-04-22 17:53:55

Parent: 5fa2d9915f8311d7996e93a3a42cf438278e3886 (net/http&#x3A; make ServeContent errors return more specific HTTP status codes)

Child:  5ed44e9d4d06091a9e53128534ce389eb9754e80 (net/http&#x3A; test and document suppressing implicit Content-Type response header)

Branches: master, remotes/origin/master and many more (41)

Follows: go1.4beta1

Precedes: go1.5beta1

    net/http&#x3A; **fix race between dialing and canceling**

    

    In the brief window between **getConn** and **persistConn.roundTrip**,

    a cancel could end up going missing.

    

    Fix by making it possible to inspect if a cancel function was cleared

    and checking if we were canceled before entering roundTrip.

    

    Fixes #10511

    

Change-Id: If6513e63fbc2edb703e36d6356ccc95a1dc33144

Reviewed-on: https&#x3A;//go-review.googlesource.com/9181

Reviewed-by: Brad Fitzpatrick &lt;bradfitz@golang.org>

The critical section in this commit is wrapped with a request mutex:

func (t \*Transport) replaceReqCanceler(r \*Request, fn func()) bool {

   t.reqMu.Lock()

   defer t.reqMu.Unlock()

   \_, ok := t.reqCanceler\[r]

   if !ok {

       return false

   }

   if fn != nil {

       t.reqCanceler\[r] = fn

   } else {

       delete(t.reqCanceler, r)

   }

   return true

}

Essentially we are modifying an array slice, which seems to be shared across multiple connections. So deleting elements from this array must happen with the atomic guarantees provided by a mutex.

The above function is called by the roundtrip function as follows. This method essentially will return errRequestCancelled, since the replaceReqCanceler will return false due to the cancel call.

func (pc persistConn) roundTrip(req transportRequest) (resp Response, err error) {

   if hook := testHookEnterRoundTrip; hook != nil {

       hook()

   }

   if !pc.t.replaceReqCanceler(req.Request, pc.cancelRequest) {

       pc.t.putIdleConn(pc)

       return nil, errRequestCanceled

   }

   ...

}

In the following function, we see the following steps:

  1. Create a test server
  2. Create transport object
  3. Create new request
  4. Setup a “round trip hook” to simply cancel the request using the transport object
  5. Then enter the roundtrip phase
  6. Due to (4), we expect the result to throw out “ExportErrRequestCanceled”
func TestTransportDialCancelRace(t testing.T) {
   defer afterTest(t)
   ts := httptest.NewServer(HandlerFunc(func(w ResponseWriter, r \*Request) {}))
   defer ts.Close()
   tr := &Transport{}
   defer tr.CloseIdleConnections()
   req, err := NewRequest("GET", ts.URL, nil)
   if err != nil {
       t.Fatal(err)
   }
   SetEnterRoundTripHook(func() {
       tr.CancelRequest(req)
   })
   defer SetEnterRoundTripHook(nil)
   res, err := tr.RoundTrip(req)
   if err != ExportErrRequestCanceled {
       t.Errorf("expected canceled request error; got %v", err)
       if err == nil {
           res.Body.Close()
       }
   }
}

Observations on code organisation of net/http

  • There is a huge number of test files:
  • The root folder for “net/http” itself has 26 files having the substring “_test” in their names
  • In contrast, I found 23 source files which are non-test files
  • net/http has a focus on exhaustive testing
  • There are 9 “submodules” of sorts. 
  • Each “submodule” file has its own package name. Example
    • Package cgi
  • Some of the “submodules” are independent of the main “http” module. Example
    • Cgi
    • Fcgi
  • Whereas at least one submodule is imported in the “http” module
    • Cookiejar
  • Imports are done through the main module name, but the “dot” operator is used to access the local files. Example “client_test.go” import statements

    • "net"
    • . "net/http"
    • "net/http/cookiejar"
    • "net/http/httptest"
  • A python analogy:

  • Source of the analogy: What does the '.' (dot or period) in a Go import statement do? - Stack Overflow

  • Came across a package for inspecting or “tracing” http client requests at a fine-grain level: net/http/httptrace

  • The httptrace package provides a number of hooks to gather information during an HTTP round trip about a variety of events. These events include:

    • Connection creation
    • Connection reuse
    • DNS lookups
    • Writing the request to the wire
    • Reading the response
  • An example tool built out of httptrace is httpstat

    • Cool breakdown of overall request  duration

Examples/Source Code

https://gitlab.com/flyweightgroup/experimental/treasurehunts

2.2 Summary of the illustrative exploration

  1. Learn about real-world usage of “WaitGroup”
  2. Insights into language design. In particular, the relationship between basic behaviour of loop variables, and Goroutines.
  3. Exposure to numerous race issues that have emerged in the “net/http” library in the past, how specialised tools and community scrutiny together have resolved them.
  4. Deepening the understanding of mutexes through trying out examples and also exposure to real-world go library code using them
  5. Learning about 3 new tools:
  6. Go Vet: Catch possible issues in the code based on heuristics
  7. Go httptrace: Library functionality providing hooks into http lifecycle
  8. Go httpstat: A useful library using httptrace to show waterfall charts of a request in the terminal
  9. Some valuable lessons on Attention To Detail, Continuous Refactoring/Improvement
  10. Realisation of importance of an exhaustive test suite, especially for heavily concurrent programs
  11. Insights into how to organise a larger Go project, submodules, appropriate usage of the dot operator (“.”)
  12. Spawning the broader question: “How to get others interested in Exploratory Studies?” leading to OSED (Open Source Exploration Drive). OSED has potential to transform the capabilities and habits of the group towards higher sophistication. 

Section 3: Ramifications of the exploratory method and related proposal

3.1 Conducting consistent exploration at Hexmos and its possibilities

As I try to reflect on why this essay has been attempted two things come to mind:

  1. Get others interested to explore one project on their own
  2. Learn something about Golang ecosystem (tools, people, practices)

Section 2 addressed the second part of this requirement. This section will focus on progressing on the first part of the requirement.

The key question is: How can this essay get others interested to explore one project on their own? We take the Polya way, and try to find a related example.

Key question: What is an example of related exploration that happened in history?

  1. Through some wandering over the webs, we get to "The age of exploration" -> A Brief History of the Age of Exploration.
  2. Summary: The Europeans acquired wealth, knowledge, dominance and (spices)  in the past 500-600 years due to the power of persistent exploration. At the root of European’s wealth and power is a commitment to exploration.

When: 15 - 17th century

What: Europeans began exploring the world by sea in search of new trading routes, wealth and knowledge (and spices)

Impacts:

  1. Europeans accumulated massive wealth
  2. Methods of navigation and mapping improved (“Necessity of better tools for navigating and conquering led to innovation in science”  - Nowke)
  3. Spread of foods, plants, animals across
  4. Decimation of indigenous people
  5. Led to slavery
  6. Former colonies still held in separation, branded "developing"

Analogy/Question: How can we "harness" open source through "exploration"?

Can wealth be generated out of a mastery of open source software?

Can knowledge be generated out of a mastery of open source software?

The europeans generated enormous wealth for themselves through exploration of the world. Can Hexmos use this analogy to "explore", "use" and “enrich” open source software?

3.2 Open Source Exploration Drive (OSED) Proposal

Applying the Age of Discovery ideas to Open Source exploration at Hexmos

We can encourage everyone to do persistent searches for finding beneficial tools & open source communities. We can decide the frequency (bi-weekly updates?). Some fields I could come up with are:

  1. Project or community name
  2. Project or community links
  3. What is the possibility of benefit to the group?
  4. What is the price to be paid in your opinion?
  5. How did you find this?
  6. Instructions on how to get started (you must have validated the method before reporting/instructing)

3.3 Call to action

Explore the open source space for a while based on the group perspective and fill out a few rows in the following sheet. We will have a review of this sheet every Wednesday and Sunday.

Open Source Exploration Drive Google Sheet

Section 4: Reflections about the “treasure hunt” approach

Reflection about the “treasure hunt” approach:

  1. The most important outcome of this exploration, in my opinion is: OSED (Open Source Exploration Drive). The OSED emerged out of a question: “How to get others interested in exploring at least one more project on their own?” The question wouldn’t have been asked if I was not in the context, trying to explore something. I didn’t plan to ask this question. It sort of happened. And such questions cannot be “planned” to be asked. The power of exploration has to be experienced, since it cannot be easily explained in a convincing way.
  2. The second important outcome of this exploration is a psychological appreciation of open source: What a treasure trove open source really is: it is the confluence of learning by doing, continuous creative exertion, cooperating with a friendly community, service and finally, of producing huge amounts of utilitarian value to the world at large. One can both  “earn” and “learn” more if only one replaces dull entertainment such as social media and tv with some valuable open source exploration and contribution activity.
  3. It is really difficult to summarise exploratory activities under any presentation order. The facts may seem too desperate and unrelated at times. 

Section 5: Appendix

5.1 Further study

Google I/O 2012 - Go Concurrency Patterns

Concurrency in Go: Tools and Techniques for Developers : Cox–buday, Katherine: Amazon.in: Books

5.1 Works Cited

“gitk Documentation.” Git, https://git-scm.com/docs/gitk/. Accessed 3 April 2022.

“golang/go · GitHub.” GitHub, https://github.com/golang/go/tree/master/src/net/http. Accessed 3 April 2022.

“The Lox Language · Crafting Interpreters.” Crafting Interpreters, https://craftinginterpreters.com/the-lox-language.html. Accessed 3 April 2022.

Raymond, Eric S., and Linus Torvalds. “Linus's law.” Wikipedia, https://en.wikipedia.org/wiki/Linus%27s_law. Accessed 3 April 2022.

Vyukov, Dmitry, and Andrew Gerrand. “Introducing the Go Race Detector - The Go Programming Language.” Golang, 26 June 2013, https://go.dev/blog/race-detector. Accessed 3 April 2022.