Introduction to HTTP#
Input URL -> browser process handles input information -> browser kernel initiates request to server -> browser kernel reads response -> browser kernel renders -> browser process page loading completed
- Hyper Text Transfer Protocol (HTTP)
- An application-layer protocol built on top of TCP at the transport layer
- Request/response model
- Simple and extensible (custom request headers can be defined as long as both client and server understand them)
- Stateless
Protocol Analysis#
Development History#
Message Structure#
HTTP/1.1#
As shown in the figure, you can see the request and response headers, the returned status code, etc.
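To make the message structure concrete, here is a minimal sketch that splits a raw HTTP/1.1 request into its request line, headers, and body. The `parseHttpRequest` helper, the sample message, and the host `example.com` are assumptions made up for illustration; a real parser has to handle many more cases (folded headers, chunked bodies, and so on).

```javascript
// Parse a raw HTTP/1.1 request message into its parts (illustrative sketch only).
function parseHttpRequest(raw) {
  // The head and the body are separated by one empty line (CRLF CRLF).
  const separator = raw.indexOf('\r\n\r\n');
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? '' : raw.slice(separator + 4);

  // The first line is the request line: method, target, protocol version.
  const [requestLine, ...headerLines] = head.split('\r\n');
  const [method, target, version] = requestLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // skip malformed lines in this sketch
    // Header names are case-insensitive, so normalise them to lower case.
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { method, target, version, headers, body };
}

// Hypothetical sample message.
const sample =
  'POST /zoos HTTP/1.1\r\n' +
  'Host: example.com\r\n' +
  'Content-Type: application/json\r\n' +
  '\r\n' +
  '{"name":"City Zoo"}';

const parsedRequest = parseHttpRequest(sample);
console.log(parsedRequest.method, parsedRequest.target, parsedRequest.headers['content-type']);
```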
Method | Description |
---|---|
GET | Requests a representation of a specified resource; GET requests should only be used to retrieve data |
POST | Used to submit an entity to a specified resource, usually resulting in a change in state or side effects on the server |
PUT | Replaces all current representations of the target resource with the request payload |
DELETE | Deletes the specified resource |
HEAD | Requests a response identical to that of a GET request, but without a response body (less commonly used) |
CONNECT | Establishes a tunnel to the server identified by the target resource. (less commonly used) |
OPTIONS | Used to describe the communication options for the target resource. |
TRACE | Performs a message loop-back test along the path to the target resource. (less commonly used) |
PATCH | Used to apply partial modifications to a resource. |
- Safe: methods that do not modify server data, i.e. read-only methods such as GET, HEAD, and OPTIONS.
- Idempotent: executing the same request once has the same effect on server state as executing it multiple times. All safe methods are idempotent, and so are PUT and DELETE; POST and PATCH are not guaranteed to be.
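The two properties can be written down as a small lookup table. This is only a sketch; the `METHOD_PROPERTIES` table and the `canRetryAutomatically` helper are hypothetical names, and the retry rule is one possible use of the idempotent property, not a requirement of the protocol.

```javascript
// Safe / idempotent properties of the methods listed above.
const METHOD_PROPERTIES = {
  GET:     { safe: true,  idempotent: true },
  HEAD:    { safe: true,  idempotent: true },
  OPTIONS: { safe: true,  idempotent: true },
  TRACE:   { safe: true,  idempotent: true },
  PUT:     { safe: false, idempotent: true },
  DELETE:  { safe: false, idempotent: true },
  POST:    { safe: false, idempotent: false },
  PATCH:   { safe: false, idempotent: false },
  CONNECT: { safe: false, idempotent: false },
};

// Hypothetical helper: a client may repeat a request on its own only when
// repeating it cannot change the outcome on the server.
function canRetryAutomatically(method) {
  const props = METHOD_PROPERTIES[method.toUpperCase()];
  return Boolean(props && props.idempotent);
}

console.log(canRetryAutomatically('put'), canRetryAutomatically('POST'));
```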
Status Codes#
- 200 OK - The client's request succeeded
- 301 Moved Permanently - The resource (webpage, etc.) has been permanently moved to another URL
- 302 Found - Temporary redirect
- 401 Unauthorized - The request lacks valid authentication credentials
- 404 Not Found - The requested resource does not exist, possibly due to an incorrect URL
- 500 Internal Server Error - An unexpected error occurred on the server
- 504 Gateway Timeout - The gateway or proxy server did not get a response from the upstream server in the allotted time
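The first digit of a status code gives its class, which is often all that client code needs to branch on. A minimal sketch, where `statusClass` is a hypothetical helper name:

```javascript
// Map a status code to its class based on the first digit.
function statusClass(code) {
  if (!Number.isInteger(code) || code < 100 || code > 599) return 'unknown';
  if (code < 200) return 'informational';
  if (code < 300) return 'success';
  if (code < 400) return 'redirection';
  if (code < 500) return 'client error';
  return 'server error';
}

for (const code of [200, 301, 404, 504]) {
  console.log(code, statusClass(code));
}
```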
RESTful API#
A style of API design: REST - Representational State Transfer
- Each URI represents a resource
- A representation of the resource is transferred between client and server
- The client operates on server-side resources through HTTP methods, which is what "representational state transfer" refers to.
Request | Return Code | Meaning |
---|---|---|
GET /zoos | 200 OK | Lists all zoos, server successfully returned |
POST /zoos | 201 CREATED | Creates a new zoo, server creation successful |
PUT /zoos/ID | 400 Bad Request | Updates a specified zoo (the client provides the complete information for that zoo); in this case the request was invalid, so the server did not create or modify any data |
DELETE /zoos/ID | 204 NO CONTENT | Deletes a specified zoo, deletion successful |
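The table can be sketched as a tiny in-memory handler that maps method and path to a status code. Everything here is hypothetical (the `handleZooRequest` function, the record shape, the validation rule); it only illustrates how URIs, methods, and return codes line up, not how a real server framework is wired.

```javascript
// In-memory store standing in for a database (hypothetical sample data).
const zoos = new Map([[1, { id: 1, name: 'City Zoo' }]]);
let nextId = 2;

function handleZooRequest(method, path, body) {
  const match = path.match(/^\/zoos(?:\/(\d+))?$/);
  if (!match) return { status: 404 };
  const id = match[1] ? Number(match[1]) : null;
  const validBody = Boolean(body) && typeof body.name === 'string';

  if (method === 'GET' && id === null) {
    return { status: 200, body: [...zoos.values()] }; // list all zoos
  }
  if (method === 'POST' && id === null) {
    if (!validBody) return { status: 400 };
    const zoo = { id: nextId++, name: body.name };
    zoos.set(zoo.id, zoo);
    return { status: 201, body: zoo }; // created
  }
  if (method === 'PUT' && id !== null) {
    if (!zoos.has(id)) return { status: 404 };
    if (!validBody) return { status: 400 }; // nothing is modified
    const zoo = { id, name: body.name }; // full replacement
    zoos.set(id, zoo);
    return { status: 200, body: zoo };
  }
  if (method === 'DELETE' && id !== null) {
    if (!zoos.has(id)) return { status: 404 };
    zoos.delete(id);
    return { status: 204 }; // no content
  }
  return { status: 405 }; // method not allowed on this URI
}
```

For example, `handleZooRequest('PUT', '/zoos/1', {})` returns status 400 and leaves the stored zoo untouched, which mirrors the PUT row of the table.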
Common Request Headers#
Request Header | Description |
---|---|
Accept | Acceptable types, indicating the MIME types supported by the browser (corresponding to the Content-Type returned by the server) |
Content-Type | The type of entity content sent by the client |
Cache-Control | Specifies the cache mechanism to be followed by requests and responses, such as no-cache |
If-Modified-Since | Pairs with the server's Last-Modified; used to check whether the file has changed, with a resolution of one second |
Expires | Cache control (set by the server in the response): an absolute server time before which the browser uses the cache without sending a request |
max-age | A Cache-Control directive rather than a standalone header: the number of seconds the resource may be cached locally; within that period the cache is used and no request is sent |
If-None-Match | Pairs with the server's ETag; used to check whether the file content has changed (very precise) |
Cookie | Cookies are sent automatically when accessing the same domain |
Referer | The URL of the page the request came from (applies to all request types and includes the full page address; commonly used to block CSRF) |
Origin | Where the request was initiated from (precise only down to the port, with no path); Origin respects privacy more than Referer |
User-Agent | Identifies the client software, such as the browser and operating system |
Common Response Headers#
Response Header | Description |
---|---|
Content-Type | The type of entity content returned by the server |
Cache-Control | Specifies the cache mechanism to be followed by requests and responses, such as no-cache |
Last-Modified | The last modification time of the requested resource |
Expires | The time after which the document is considered expired and is no longer served from the cache |
max-age | A Cache-Control directive: how many seconds the client may cache the resource locally |
ETag | An identifier for a specific version of the resource, similar to a fingerprint |
Set-Cookie | Sets the cookie associated with the page, the server sends the cookie to the client through this header |
Server | Some related information about the server |
Access-Control-Allow-Origin | The origins that the server allows to read the response (e.g., *) |
Caching#
Strong Caching
Use the local copy directly if it is available
- Expires (expiration time), an absolute timestamp
- Cache-Control
  - Cacheability
    - no-cache: the cached copy must be validated with the server before use (negotiated caching)
    - no-store: do not use any cache
    - public, private, etc.
  - Expiration
    - max-age: the maximum lifetime of the stored data in seconds, relative to the request time
  - Revalidation and reloading
    - must-revalidate: once the resource expires, it cannot be used until it has been successfully validated with the origin server.
Negotiated Caching
Communicate with the server to determine whether the cached copy can be used
- ETag/If-None-Match: an identifier for a specific version of the resource, similar to a fingerprint
- Last-Modified/If-Modified-Since: last modification time (absolute)
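The two stages can be sketched as two small functions: a freshness check on the client side (strong caching) and a validator comparison on the server side (negotiated caching). The function names and the `resource` shape are assumptions for illustration, and the ETag comparison is deliberately simplified: it ignores weak validators and comma-separated lists.

```javascript
// Strong caching: is the stored response still fresh according to max-age?
function isFresh(cacheControl, storedAtMs, nowMs) {
  const value = cacheControl || '';
  // no-cache and no-store both rule out using the copy without asking the server.
  if (/(?:^|,)\s*no-(?:cache|store)\b/i.test(value)) return false;
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(value);
  if (!match) return false;
  return (nowMs - storedAtMs) / 1000 < Number(match[1]);
}

// Negotiated caching: answer 304 (use your copy) or 200 (here is a new one).
function revalidate(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  if (ifNoneMatch !== undefined) {
    // When both validators are sent, the ETag takes precedence.
    return ifNoneMatch === resource.etag ? 304 : 200;
  }
  const ifModifiedSince = requestHeaders['if-modified-since'];
  if (ifModifiedSince !== undefined) {
    // HTTP dates only have one-second resolution, so truncate before comparing.
    const modified = Math.floor(resource.lastModifiedMs / 1000) * 1000;
    return modified <= Date.parse(ifModifiedSince) ? 304 : 200;
  }
  return 200;
}

// Hypothetical resource for the usage example.
const stylesheet = { etag: '"v1"', lastModifiedMs: Date.UTC(2015, 9, 21, 7, 28, 0) };
console.log(isFresh('public, max-age=60', 0, 30000), revalidate({ 'if-none-match': '"v1"' }, stylesheet));
```

The one-second truncation in `revalidate` is the reason the modification-time validator is less precise than the ETag: two changes within the same second look identical to it.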
Cookies#
Set-Cookie (response header)
Attribute | Description |
---|---|
Name=value | The name and value of the cookie |
Expires=Date | The validity period of the cookie; by default, the cookie is only valid until the browser is closed. |
Path=path | Limits the URL path for which the cookie is sent, defaulting to the current path |
Domain=domain | Limits the domain name where the cookie is effective, defaulting to the domain name of the service that created the cookie |
Secure | The cookie can only be sent over HTTPS secure connections |
HttpOnly | JavaScript scripts cannot access the cookie |
SameSite=None/Strict/Lax | None: sent with both same-site and cross-site requests; Strict: sent only with same-site requests; Lax: also sent with top-level navigation and with GET requests initiated by third-party websites |
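As a rough illustration of how these attributes end up in one header value, the sketch below assembles a `Set-Cookie` string. `serializeCookie` and its option names are hypothetical; in practice a framework or a cookie library usually does this.

```javascript
// Build the value of a Set-Cookie response header from a name, a value, and attributes.
function serializeCookie(name, value, options = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (options.expires) parts.push(`Expires=${options.expires.toUTCString()}`);
  if (options.path) parts.push(`Path=${options.path}`);
  if (options.domain) parts.push(`Domain=${options.domain}`);
  if (options.secure) parts.push('Secure');
  if (options.httpOnly) parts.push('HttpOnly');
  if (options.sameSite) parts.push(`SameSite=${options.sameSite}`);
  return parts.join('; ');
}

const header = serializeCookie('session', 'abc 123', {
  path: '/',
  secure: true,
  httpOnly: true,
  sameSite: 'Lax',
});
console.log(header);
```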
Development#
Overview of HTTP/2: Faster, more stable, simpler
- Frame: the smallest unit of communication in HTTP/2. Each frame contains a frame header, which at a minimum identifies the data stream the frame belongs to.
- HTTP/1.x transmits text, while HTTP/2 transmits binary data, which is more efficient. HTTP/2 also has a new header compression algorithm (HPACK).
- Message: a complete sequence of frames corresponding to a logical request or response message.
- Data stream: a bidirectional byte stream within an established connection that can carry one or more messages.
- Frames from different streams are sent interleaved, and the receiver reassembles them.
- HTTP/2 connections are persistent, and only one connection is needed for each origin.
- Flow control: a mechanism that prevents the sender from overwhelming the receiver with data.
- Server push
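To show what "binary frame with a header that identifies the stream" means in practice, here is a sketch that decodes the fixed 9-byte HTTP/2 frame header: a 24-bit payload length, an 8-bit type, 8-bit flags, and a 31-bit stream identifier preceded by one reserved bit. The `parseFrameHeader` name and the sample bytes are made up for illustration.

```javascript
// Frame type codes defined by the HTTP/2 specification.
const FRAME_TYPES = [
  'DATA', 'HEADERS', 'PRIORITY', 'RST_STREAM', 'SETTINGS',
  'PUSH_PROMISE', 'PING', 'GOAWAY', 'WINDOW_UPDATE', 'CONTINUATION',
];

function parseFrameHeader(buf) {
  if (buf.length < 9) throw new RangeError('an HTTP/2 frame header is 9 bytes long');
  const type = buf.readUInt8(3);
  return {
    length: buf.readUIntBE(0, 3),               // 24-bit payload length
    type: FRAME_TYPES[type] || `UNKNOWN(${type})`,
    flags: buf.readUInt8(4),                    // 8-bit flags, meaning depends on the type
    streamId: buf.readUInt32BE(5) & 0x7fffffff, // drop the reserved top bit
  };
}

// Hypothetical header bytes: 16-byte HEADERS frame, flag 0x4 set, stream 3.
const frameHeader = parseFrameHeader(Buffer.from([0x00, 0x00, 0x10, 0x01, 0x04, 0x00, 0x00, 0x00, 0x03]));
console.log(frameHeader);
```

Because every frame carries its stream identifier, frames of different streams can be interleaved on one connection and sorted back into messages by the receiver.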
Overview of HTTPS#
- HTTPS: Hypertext Transfer Protocol Secure
- Encrypted via TLS/SSL
- Symmetric encryption: both encryption and decryption use the same key
- Asymmetric encryption: encryption and decryption require two different keys, a public key and a private key
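The difference between the two kinds of encryption can be tried out with Node's built-in `crypto` module. This is only a sketch of the two primitives, not of the TLS handshake itself; the function names are hypothetical, and AES-256-GCM and 2048-bit RSA are just example algorithm choices.

```javascript
const nodeCrypto = require('crypto');

// Symmetric: the same key is used to encrypt and to decrypt (AES-256-GCM here).
function symmetricRoundTrip(plaintext) {
  const key = nodeCrypto.randomBytes(32);
  const iv = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', key, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();

  const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(encrypted), decipher.final()]).toString('utf8');
}

// Asymmetric: the public key encrypts, only the matching private key decrypts.
function asymmetricRoundTrip(plaintext) {
  const { publicKey, privateKey } = nodeCrypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
  const encrypted = nodeCrypto.publicEncrypt(publicKey, Buffer.from(plaintext, 'utf8'));
  return nodeCrypto.privateDecrypt(privateKey, encrypted).toString('utf8');
}

console.log(symmetricRoundTrip('hello'), asymmetricRoundTrip('hello'));
```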
Common Scenario Analysis#
Static Resources#
Taking Toutiao as an example, open the network panel to view its requests and find the request for the CSS file.
The returned status code is 200, but was a request really sent? The note in parentheses next to it says "from disk cache", so the response was served from the local cache.
From the response headers in the above image, we can see:
- Cache strategy?
  - Strong cache (Cache-Control: max-age=xxxxx), which works out to 1 year
- Other information?
  - Allows access from all domains (access-control-allow-origin)
  - Resource type: css (content-type)
Static resource solution: cache + CDN + file name hash
- CDN: Content Delivery Network
- By judging user proximity and server load, CDN ensures that content is served to user requests in a highly efficient manner.
With such a long cache period, how can we ensure that the content users receive is up-to-date?
File name hash: when the file content changes, the file name changes too (or a version number is added), so the cached file no longer matches and the new file must be requested.
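A minimal sketch of the file name hash idea, assuming Node's `crypto` and `path` modules; `hashedFileName` is a hypothetical helper, and build tools normally generate such names automatically. The first 8 hex characters of a SHA-256 digest are used here only as an example length.

```javascript
const nodeCrypto = require('crypto');
const path = require('path');

// Derive the file name from the content, so a content change yields a new URL.
function hashedFileName(fileName, content) {
  const hash = nodeCrypto.createHash('sha256').update(content).digest('hex').slice(0, 8);
  const { name, ext } = path.parse(fileName);
  return `${name}.${hash}${ext}`;
}

const before = hashedFileName('main.css', 'body { color: black; }');
const after = hashedFileName('main.css', 'body { color: navy; }');
console.log(before, after, before === after);
```

Because the URL changes whenever the content changes, the old cached entry is simply never requested again, which is what makes a one-year max-age safe.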
Login - Cross-Domain#
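For cross-origin requests that are not simple requests, the browser first sends a preflight request, which is why the request method shows up as OPTIONS.
If any one of protocol, hostname, or port differs, the request is cross-origin (the default port is 80 for HTTP and 443 for HTTPS).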
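The rule can be checked with the standard `URL` class, whose `origin` property combines protocol, hostname, and port. `isSameOrigin` is a hypothetical helper and the URLs are placeholders.

```javascript
// Two URLs share an origin only if protocol, hostname, and port all match.
function isSameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

console.log(isSameOrigin('https://example.com/a', 'https://example.com:443/b')); // default https port
console.log(isSameOrigin('https://example.com', 'https://api.example.com'));     // different hostname
```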
Solving Cross-Domain Issues#
- Cross-Origin Resource Sharing (CORS)
  - CORS is a mechanism based on HTTP headers that allows a server to indicate which origins (domain, protocol, and port) other than its own may request its resources. CORS also includes a mechanism to check whether the server will allow the actual request to be sent: the browser first sends a "preflight" request to the server hosting the cross-origin resource, and the headers of that preflight indicate the HTTP method and headers that the actual request will use.
  - For security reasons, browsers restrict cross-origin HTTP requests initiated from scripts. For example, XMLHttpRequest and the Fetch API adhere to the same-origin policy. This means that web applications using these APIs can only request HTTP resources from the same origin that loaded the application, unless the response includes the correct CORS response headers.
  - Preflight request: asks whether the server allows the cross-origin request (sent for complex requests)
  - Related protocol headers: access-control-....
- Proxy Server
  - The same-origin policy is a security policy of the browser, not of HTTP.
- Iframe: has many inconveniences
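On the server side, answering a preflight comes down to returning the right `Access-Control-*` headers. The sketch below is a simplified, framework-free version: the allow-list, the origin `https://www.example.com`, the method list, and the 600-second max age are all assumptions, and echoing the requested headers back is a shortcut that a real service would replace with an explicit list.

```javascript
// Hypothetical allow-list of origins that may call this API.
const ALLOWED_ORIGINS = new Set(['https://www.example.com']);

// requestHeaders: an object with lower-cased header names, as Node's http module provides.
function preflightResponse(requestHeaders) {
  const origin = requestHeaders['origin'];
  if (!origin || !ALLOWED_ORIGINS.has(origin)) {
    return { status: 403, headers: {} }; // no CORS headers, so the browser blocks the request
  }
  return {
    status: 204,
    headers: {
      'Access-Control-Allow-Origin': origin,
      'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE',
      // Simplification: allow whatever headers the browser announced.
      'Access-Control-Allow-Headers': requestHeaders['access-control-request-headers'] || '',
      'Access-Control-Max-Age': '600', // let the browser cache this preflight result
      'Vary': 'Origin',
    },
  };
}

const allowed = preflightResponse({
  'origin': 'https://www.example.com',
  'access-control-request-method': 'POST',
  'access-control-request-headers': 'content-type',
});
console.log(allowed.status, allowed.headers['Access-Control-Allow-Origin']);
```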
As shown in the figure, what actions were taken during login?
- Used the POST method
- Target domain: https://sso.toutiao.com
- Target path: /quick_login/v2/
What information was carried, and what information was returned?
- Carried information
  - POST body, data format is form
  - Desired data format is JSON
  - Existing cookies
- Returned information
  - Data format is JSON
  - Cookie information
So why can the login state be remembered the next time the page is accessed?
Authentication#
- Session + cookie (most portal websites use this)
  - The user submits a request to the server, including username, password, etc.
  - The server verifies the credentials; if they are correct, it creates a session and returns it in a cookie (Set-Cookie: session = ......)
  - On later requests the browser sends the cookie back: GET with Cookie: session=....
  - The server validates the session and returns the login information.
- JWT (JSON Web Token)
  - The server does not need to store session state locally.
  - The returned token is unique, has a short validity period, etc.
- SSO: Single Sign-On
As shown in the figure, it is explained very clearly.
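To illustrate why the server does not need to store anything for JWT, here is a minimal HS256 sketch using only Node's `crypto` module: the token carries its own payload, and the server just recomputes the signature. `signToken`, `verifyToken`, the secret, and the payload fields are hypothetical; the sketch assumes a Node version that supports the `base64url` encoding and skips checks (such as validating the `alg` header) that a production library performs.

```javascript
const nodeCrypto = require('crypto');

const toBase64Url = (text) => Buffer.from(text, 'utf8').toString('base64url');

// Token layout: base64url(header) . base64url(payload) . base64url(signature)
function signToken(payload, secret) {
  const header = toBase64Url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = toBase64Url(JSON.stringify(payload));
  const signature = nodeCrypto.createHmac('sha256', secret).update(`${header}.${body}`).digest('base64url');
  return `${header}.${body}.${signature}`;
}

// Returns the payload if the signature matches and the token has not expired, else null.
function verifyToken(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;
  const expected = nodeCrypto.createHmac('sha256', secret).update(`${header}.${body}`).digest();
  const actual = Buffer.from(signature, 'base64url');
  if (actual.length !== expected.length || !nodeCrypto.timingSafeEqual(actual, expected)) return null;
  const payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  if (typeof payload.exp === 'number' && payload.exp <= nowSeconds) return null; // short validity
  return payload;
}

// Hypothetical secret and claims.
const token = signToken({ sub: 'user-42', exp: 2000 }, 'demo-secret');
console.log(verifyToken(token, 'demo-secret', 1000));
```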
Practical Applications#
XMLHttpRequest - Web API Interface Reference | MDN (mozilla.org)
AJAX and XHR#
- XHR: XMLHttpRequest
- readyState
Value | State | Description |
---|---|---|
0 | UNSENT | The proxy has been created, but open() has not yet been called. |
1 | OPENED | The open() method has been called. |
2 | HEADERS_RECEIVED | The send() method has been called, and the headers and status are available. |
3 | LOADING | Downloading; the responseText property contains some data. |
4 | DONE | The download operation is complete. |
AJAX and Fetch#
- An upgraded version of XMLHttpRequest
- Uses Promises
- Modular design: Response, Request, and Headers objects
- Supports chunked reading through data stream processing objects
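The modular design shows up in how a request is built from separate `Headers` and `Request` objects before it is sent. A small sketch, assuming an environment where these are globals (browsers, or Node.js 18 and later); `buildJsonRequest` and the URL are hypothetical.

```javascript
// Assemble a JSON POST request from the Fetch API building blocks.
function buildJsonRequest(url, data) {
  const headers = new Headers({ Accept: 'application/json' });
  headers.set('Content-Type', 'application/json');
  return new Request(url, { method: 'POST', headers, body: JSON.stringify(data) });
}

const loginRequest = buildJsonRequest('https://api.example.com/login', { user: 'demo' });
console.log(loginRequest.method, loginRequest.headers.get('content-type'));
```

Sending it is then a single Promise chain, for example `fetch(loginRequest).then((response) => response.json())`.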
Standard Library in Node: HTTP/HTTPS#
- Default module, no need to install other dependencies
- Limited functionality / not very user-friendly
Common Request Library: axios#
- Getting Started | Axios Chinese Documentation
- Supports browser and nodejs environments
- Rich interceptors
```javascript
const fs = require('fs');
const axios = require('axios');

// Global configuration
axios.defaults.baseURL = "https://api.example.com";

// Add request interceptor
axios.interceptors.request.use(function (config) {
  // Do something before sending the request
  return config;
}, function (error) {
  // Do something with request error
  return Promise.reject(error);
});

// Send request; the 'stream' response type is for the Node.js environment
axios({
  method: 'get',
  url: 'http://test.com',
  responseType: 'stream'
}).then(function (response) {
  // Pipe the response stream into a local file
  response.data.pipe(fs.createWriteStream('ada_lovelace.jpg'));
});
```
Network Optimization#
- HTTP/2 - A Real-World Performance Test and Analysis | CSS-Tricks
- Pre-resolving, pre-connecting, etc.
- Retrying is an effective means to ensure stability, but it must not make an already bad situation worse (e.g., when the network connection is broken).
- Reasonable use of caching serves as the last line of defense.
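One common way to keep retries from making things worse is to cap the number of attempts and wait longer between each one (exponential backoff). The sketch below is one possible shape of that idea; `backoffDelays`, `retry`, and the default numbers are assumptions, not part of the original course material.

```javascript
// Delay before each retry: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
function backoffDelays(maxRetries, baseMs, capMs) {
  return Array.from({ length: maxRetries }, (_, i) => Math.min(capMs, baseMs * 2 ** i));
}

// Run an async task, retrying a limited number of times with growing pauses.
async function retry(task, { maxRetries = 3, baseMs = 200, capMs = 5000 } = {}) {
  const delays = backoffDelays(maxRetries, baseMs, capMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt >= delays.length) throw error; // retries used up: give up
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}

console.log(backoffDelays(4, 200, 1000));
```

A caller would wrap the request, for example `retry(() => fetch(url))`, and ideally only for idempotent requests.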
Learn More#
More than One Choice for HTTP Protocol#
Extension - Communication Methods#
WebSocket#
- A network technology for full-duplex communication between the browser and server
- Typical scenario: high real-time requirements, such as chat rooms
- URL starts with ws:// or wss://
UDP#
QUIC (Quick UDP Internet Connections): a transport protocol built on top of UDP
- 0-RTT connection establishment (except for the first connection).
- Reliable transmission similar to TCP.
- Encrypted transmission similar to TLS, supporting perfect forward secrecy.
- User-space congestion control, latest BBR algorithm.
- Supports stream-based multiplexing similar to h2, but without TCP's head-of-line (HOL) blocking problem.
- Forward error correction (FEC).
- Connection migration similar to MPTCP.
- Not many applications yet.
Summary and Thoughts#
Today, in a gentle voice, the instructor introduced HTTP, covering protocol analysis, message structure, and caching strategies, and explained how it is used in specific business scenarios.
Most of the content cited in this article comes from Teacher Yang Chaonan's class - HTTP Practical Guide.