
One article to understand java.io, character encoding, URL, and Spring Resource

1 java.io byte streams

[Figure: inputstream.png]

  • LineNumberInputStream and StringBufferInputStream are deprecated; the official recommendation is to use LineNumberReader and StringReader instead.
  • ByteArrayInputStream and ByteArrayOutputStream are byte-array streams: they use an in-memory buffer as the stream, and reading from that buffer is faster than reading from a storage medium such as a disk.
// Use ByteArrayOutputStream to temporarily buffer data from another source
ByteArrayOutputStream data = new ByteArrayOutputStream(1024); // initial buffer capacity of 1024 bytes
data.write(System.in.read()); // buffer one byte of user input (read() returns a single byte, or -1 at end of stream)

// Hand the buffered bytes to a ByteArrayInputStream
ByteArrayInputStream in = new ByteArrayInputStream(data.toByteArray());
  • FileInputStream and FileOutputStream are file streams: they expose a file as an InputStream/OutputStream for reading and writing.
  • ObjectInputStream and ObjectOutputStream are object streams. Their constructors wrap another stream and add the ability to read and write Java objects. They can be used for serialization; the objects must implement the Serializable interface.
// Write a Java object (Example is a placeholder class that implements Serializable)
FileOutputStream fileOut = new FileOutputStream("example.txt");
ObjectOutputStream out = new ObjectOutputStream(fileOut);
Example example = new Example();
out.writeObject(example);
out.close(); // flushes and closes the underlying file stream

// Read the Java object back
FileInputStream fileIn = new FileInputStream("example.txt");
ObjectInputStream in = new ObjectInputStream(fileIn);
Example restored = (Example) in.readObject();
in.close();
  • PipedInputStream and PipedOutputStream are piped streams, suited to passing data between two threads: one thread writes to the piped output stream and another thread reads from the piped input stream.
// Sender and Receiver are placeholder Thread subclasses that each own one end of the pipe
Sender sender = new Sender();       // create the sender
Receiver receiver = new Receiver(); // create the receiver
// Get the output and input ends of the pipe
PipedOutputStream outputStream = sender.getOutputStream();
PipedInputStream inputStream = receiver.getInputStream();
// Connect the two ends; this step is essential, an unconnected pipe cannot be read or written
outputStream.connect(inputStream);
sender.start();   // start the sender thread
receiver.start(); // start the receiver thread
  • SequenceInputStream concatenates several InputStreams into one, so an application can read them back to back as a single stream.
InputStream in1 = new FileInputStream("example1.txt");
InputStream in2 = new FileInputStream("example2.txt");
SequenceInputStream sequenceInputStream = new SequenceInputStream(in1, in2);
// Read one byte; once in1 is exhausted, reading continues with in2
int data = sequenceInputStream.read();
  • FilterInputStream and FilterOutputStream use the decorator pattern to add functionality to a stream; subclass constructors take an InputStream/OutputStream to wrap.
ByteArrayOutputStream out = new ByteArrayOutputStream(1024);
// Writing: decorate an OutputStream with DataOutputStream
// so that it can write primitive values
DataOutputStream dataOut = new DataOutputStream(out);
dataOut.writeDouble(1.0);
// Reading: decorate an InputStream with DataInputStream
ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
DataInputStream dataIn = new DataInputStream(in);
double data = dataIn.readDouble();
  • DataInputStream and DataOutputStream (filter stream subclasses) add the ability to read and write primitive values such as byte and int, as well as Strings, to another stream.
  • BufferedInputStream and BufferedOutputStream (filter stream subclasses) add buffering to another stream.
  • PushbackInputStream (a FilterInputStream subclass) is a push-back input stream: bytes that have already been read can be pushed back into the stream's buffer and read again.
  • PrintStream (a FilterOutputStream subclass) is a print stream; System.out is itself a PrintStream.
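The piped-stream snippet in this section relies on Sender and Receiver classes that the article does not show. As a minimal sketch of the same idea using only the JDK, the hypothetical PipeDemo class below (the class name, method name, and message are illustrative assumptions) connects the two ends through the PipedInputStream constructor and uses a plain Thread as the writer. It assumes Java 9 or later for readAllBytes().

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names and message are illustrative only.
class PipeDemo {

    // Sends `message` from a writer thread to the calling thread through a pipe.
    static String sendThroughPipe(String message) {
        try (PipedOutputStream out = new PipedOutputStream();
             PipedInputStream in = new PipedInputStream(out)) { // connects both ends

            Thread writer = new Thread(() -> {
                try {
                    out.write(message.getBytes(StandardCharsets.UTF_8));
                    out.close(); // signals end of stream to the reader
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            writer.start();

            // Blocks until the writer closes its end of the pipe.
            byte[] received = in.readAllBytes();
            writer.join();
            return new String(received, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendThroughPipe("hello from the writer thread"));
    }
}
```

Reading and writing on the same thread is best avoided with piped streams, because the pipe's internal buffer is small and a single thread can block itself; that is why the writer runs on its own thread here.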

2 java.io character streams

[Figure: 21.png]

  • As the class diagrams of the byte streams and character streams show, the two families mirror each other, for example CharArrayReader and ByteArrayInputStream.
  • Converting between byte streams and character streams: InputStreamReader turns an InputStream into a Reader, and OutputStreamWriter turns an OutputStream into a Writer.
// InputStream to Reader
InputStream inputStream = new ByteArrayInputStream("程序".getBytes(StandardCharsets.UTF_8));
InputStreamReader reader = new InputStreamReader(inputStream, StandardCharsets.UTF_8);
// OutputStream to Writer
OutputStream out = new FileOutputStream("example.txt");
OutputStreamWriter writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
// Read and write in units of characters
char[] buf = new char[2];
int n = reader.read(buf);    // number of chars read, or -1 at end of stream
if (n > 0) {
    writer.write(buf, 0, n); // write the characters themselves, not the count
}
writer.close();
  • The difference: a byte stream reads in units of bytes, a character stream in units of characters. A character is made up of one or more bytes; in the variable-length encoding UTF-8, for example, a character takes 1 to 4 bytes.
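To make the byte-versus-character distinction concrete, here is a small sketch that reads the same data once as a byte stream and once as a character stream and counts the units each one delivers. The class and method names are hypothetical, and the sample text 程序 is written with Unicode escapes so the source compiles regardless of file encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
class ByteVsCharCount {
    // The two-character string 程序, written with escapes.
    static final String TEXT = "\u7a0b\u5e8f";

    // Counts how many read() calls succeed on the raw byte stream.
    static int countBytes(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        int count = 0;
        try (InputStream in = new ByteArrayInputStream(utf8)) {
            while (in.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Counts how many read() calls succeed once the bytes are decoded as UTF-8.
    static int countChars(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        int count = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("bytes: " + countBytes(TEXT));
        System.out.println("chars: " + countChars(TEXT));
    }
}
```

Each of the two characters occupies three bytes in UTF-8, so the byte stream delivers six units where the character stream delivers two.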

3 Garbled text and character streams

  • The same character has different byte lengths (widths) under different encodings. For example, “程” in UTF-8 consists of the three bytes [-25][-88][-117], whereas encoding it as ISO_8859_1 yields the single byte [63] (a “?”, because ISO-8859-1 cannot represent the character).
  • Resources are usually handled as byte streams, but the same data turns into different bytes under different character encodings, which easily leads to garbled text.
  • There are two kinds of garbling:
      • Encoding and decoding use different character encodings: the resource is encoded in UTF-8 but the code decodes it as GBK.
      • The number of bytes read from the byte stream does not match the character's width: characters are made of bytes, and “程” in UTF-8 is three bytes. If you read the InputStream two bytes at a time and convert each chunk to a String (with UTF-8 as the default charset), you get garbled text (half a Chinese character, anyone's guess what it was).
ByteArrayInputStream in = new ByteArrayInputStream("好程序".getBytes(StandardCharsets.UTF_8));
byte[] buf = new byte[2]; // read only two bytes of the stream
in.read(buf); // read the data
System.out.println(new String(buf, StandardCharsets.UTF_8)); // decode an incomplete character
---result----
�  // garbled
  • Garbling scenario 1: if you know the resource's character encoding, decode with that same encoding and the problem goes away.
  • Garbling scenario 2: you can read all the bytes at once and decode them in one go, but for large file streams that is impractical, which is why character streams exist.
  • Wrap the byte stream with InputStreamReader or OutputStreamWriter to turn it into a character stream, specifying the character encoding there; the data is then processed character by character, which avoids the garbling.
InputStreamReader reader =
      new InputStreamReader(inputStream, StandardCharsets.UTF_8);
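The two garbling scenarios and the character-stream fix can be reproduced side by side. The sketch below is an illustration with hypothetical names; it uses ISO-8859-1 rather than GBK for the mismatched decode, only because ISO-8859-1 is guaranteed to be present in every Java runtime.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
class GarbledTextDemo {
    // The string 程序, written with escapes so the source encoding does not matter.
    static final String TEXT = "\u7a0b\u5e8f";

    // Scenario 1: bytes encoded as UTF-8 are decoded with a different charset.
    static String decodeWithWrongCharset(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Scenario 2: fixed-size byte chunks are decoded independently,
    // so a multi-byte character can be cut in half.
    static String decodeInChunks(String text, int chunkSize) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < utf8.length; i += chunkSize) {
            int length = Math.min(chunkSize, utf8.length - i);
            result.append(new String(utf8, i, length, StandardCharsets.UTF_8));
        }
        return result.toString();
    }

    // Fix: let InputStreamReader decode; it keeps partial characters
    // internally, even when the char buffer is tiny.
    static String decodeWithReader(String text, int bufferSize) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder result = new StringBuilder();
        char[] buffer = new char[bufferSize];
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                result.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeWithWrongCharset(TEXT));
        System.out.println(decodeInChunks(TEXT, 2));
        System.out.println(decodeWithReader(TEXT, 1));
    }
}
```

Only the last call prints the original text; the first yields six unrelated Latin-1 characters and the second contains U+FFFD replacement characters.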

4 Character set versus character encoding

  • The relationship between the two: a character set is the specification and a character encoding is a concrete implementation of it. The character set assigns each symbol a unique numeric value but does not prescribe how that value is stored.
  • Unicode, ASCII, GB2312, and GBK are all character sets. ASCII, GB2312, and GBK are each both a character set and a character encoding, so take care not to confuse the two notions; Unicode's concrete encodings are UTF-8, UTF-16, and UTF-32.
  • ASCII came first. It uses one byte (8 bits) per character, and standard ASCII defines 128 characters, which is enough for English. But how should Chinese, Japanese, and other scripts be mapped? Hence the other, larger character sets.
  • Unicode (the unified character set) originally used 2 bytes per character, so the whole set could hold 65536 characters. That turned out not to be enough, so it was extended: the code space now runs from U+0000 to U+10FFFF, the added supplementary range being U+010000~U+10FFFF.
  • It is therefore wrong to say that Unicode is two bytes. UTF-8 is variable-length and needs 1 to 4 bytes per character; UTF-16 usually takes two bytes (the U+0000~U+FFFF range) and uses 4 bytes when two are not enough; UTF-32 is a fixed four bytes.
  • A Unicode character is written as “U+” followed by a hexadecimal number; for example, “字” is U+5B57.
  • Mapping between Unicode code points and the UTF-8 encoding:

Range              | Unicode (binary)                    | UTF-8 encoding (binary)             | UTF-8 byte length
U+0000~U+007F      | 00000000 00000000 00000000 0XXXXXXX | 0XXXXXXX                            | 1
U+0080~U+07FF      | 00000000 00000000 00000YYY YYXXXXXX | 110YYYYY 10XXXXXX                   | 2
U+0800~U+FFFF      | 00000000 00000000 ZZZZYYYY YYXXXXXX | 1110ZZZZ 10YYYYYY 10XXXXXX          | 3
U+010000~U+10FFFF  | 00000000 000AAAZZ ZZZZYYYY YYXXXXXX | 11110AAA 10ZZZZZZ 10YYYYYY 10XXXXXX | 4
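The bit layout in this table translates almost directly into shifts and masks. The following sketch is a hand-written encoder for illustration only (the class and method names are hypothetical; real code should simply call String.getBytes), together with a helper that compares its output with the JDK's own UTF-8 encoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; in production code use String.getBytes(UTF_8).
class Utf8ByHand {

    // Encodes one Unicode code point following the four rows of the table.
    static byte[] encode(int codePoint) {
        boolean surrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
        if (!Character.isValidCodePoint(codePoint) || surrogate) {
            throw new IllegalArgumentException("not an encodable code point: " + codePoint);
        }
        if (codePoint <= 0x7F) {            // 0XXXXXXX
            return new byte[] {(byte) codePoint};
        }
        if (codePoint <= 0x7FF) {           // 110YYYYY 10XXXXXX
            return new byte[] {
                (byte) (0xC0 | (codePoint >> 6)),
                (byte) (0x80 | (codePoint & 0x3F))
            };
        }
        if (codePoint <= 0xFFFF) {          // 1110ZZZZ 10YYYYYY 10XXXXXX
            return new byte[] {
                (byte) (0xE0 | (codePoint >> 12)),
                (byte) (0x80 | ((codePoint >> 6) & 0x3F)),
                (byte) (0x80 | (codePoint & 0x3F))
            };
        }
        return new byte[] {                 // 11110AAA 10ZZZZZZ 10YYYYYY 10XXXXXX
            (byte) (0xF0 | (codePoint >> 18)),
            (byte) (0x80 | ((codePoint >> 12) & 0x3F)),
            (byte) (0x80 | ((codePoint >> 6) & 0x3F)),
            (byte) (0x80 | (codePoint & 0x3F))
        };
    }

    // Compares the hand-written result with the JDK's UTF-8 encoder.
    static boolean matchesJdk(int codePoint) {
        String text = new String(Character.toChars(codePoint));
        byte[] expected = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(expected, encode(codePoint));
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xE9, 0x7A0B, 0x1F600};
        for (int codePoint : samples) {
            System.out.printf("U+%04X -> %s (matches JDK: %b)%n",
                    codePoint, Arrays.toString(encode(codePoint)), matchesJdk(codePoint));
        }
    }
}
```

The four sample code points were chosen so that each one falls into a different row of the table.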

  • A program distinguishes internal and external encodings. When people say Java's default encoding is UTF-8, they mean the external encoding. Internal encodings tend to be fixed-width, which aligns well with memory and is easy to process; external encodings tend to be variable-width, giving common characters short codes and rare characters long ones to save storage space and transmission bandwidth.
  • In JDK 8, String stores its characters in a char[]; a char is two bytes and holds UTF-16 (the internal encoding). The commonly used Chinese characters fall within U+0000~U+FFFF, so storing them in a char (UTF-16) does not garble them.
  • From JDK 9 on, String stores its content in a byte[] array (compact strings), mainly to save memory. Characters that do not fit in one char, such as emoji, take two UTF-16 code units.
  • In JDK 9, if every character of the string is an ISO-8859-1/Latin-1 character (1 byte per character), the string is stored in ISO-8859-1/Latin-1; otherwise the array is stored in UTF-16 (2 or 4 bytes per character).
System.out.println(Charset.defaultCharset()); // print Java's default charset
for (byte item : "程序".getBytes(StandardCharsets.UTF_16)) {
    System.out.print("[" + item + "]");
}
System.out.println("");
for (byte item : "程序".getBytes(StandardCharsets.UTF_8)) {
    System.out.print("[" + item + "]");
}
----result----
UTF-8       // Java's default charset here is UTF-8
[-2][-1][122][11][94][-113] // UTF_16: 6 bytes?
[-25][-88][-117][-27][-70][-113] // UTF_8: 6 bytes, as expected
  • The UTF-16 encoding of “程序” comes out as 6 bytes, two more than expected. What is going on? Try again with a single character:
for (byte item : "程".getBytes(StandardCharsets.UTF_16)) {
    System.out.print("[" + item + "]");
}
---result--
[-2][-1][122][11]
  • So the UTF-16 output carries two extra bytes, [-2][-1], which is 0xFEFF in hex. This is the byte order mark (BOM); it tells whether the byte order is big-endian or little-endian. Take the character '中', whose Unicode value is 4E2D in hex: storing 4E first and 2D second is big-endian; storing 2D first and 4E second is little-endian. A leading FE FF means the data is stored big-endian, and FF FE means little-endian.
  • Why does UTF-8 have no byte-order problem? In my view, because UTF-8 is variable-length: the leading bits of the first byte (0, 110, 1110, 11110) determine how many following bytes make up the character, so the bytes must be read in one fixed order, and the reverse order would be hard to process. Its code unit is also a single byte, so there is nothing to swap.
  • Personally I feel UTF-16 could also have mandated big-endian, but there is historical baggage...
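The byte order mark can be observed directly by encoding the same character with Java's three UTF-16 charsets. This is a small sketch with hypothetical class and method names; it prints bytes as unsigned hex instead of signed decimals to make FE FF easier to spot, and it also shows that the UTF-16 decoder uses the BOM to choose the byte order.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
class Utf16ByteOrder {

    // Encodes text with the given charset and renders the bytes as hex pairs.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes bytes as UTF-16; a leading BOM, if present, selects the byte order.
    static String decodeUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        String text = "\u7a0b"; // the character 程
        System.out.println("UTF_16   : " + hex(text, StandardCharsets.UTF_16));
        System.out.println("UTF_16BE : " + hex(text, StandardCharsets.UTF_16BE));
        System.out.println("UTF_16LE : " + hex(text, StandardCharsets.UTF_16LE));

        byte[] littleEndianWithBom = {(byte) 0xFF, (byte) 0xFE, 0x0B, 0x7A};
        System.out.println(decodeUtf16(littleEndianWithBom).equals(text));
    }
}
```

If the two extra bytes are unwanted, encoding with UTF_16BE or UTF_16LE fixes the byte order up front and no BOM is written.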

5 A brief introduction to URI

  • java.io lets us work with resource streams, but how do we open and locate resources on the network? The answer: URI and URL.
  • URI stands for Uniform Resource Identifier.
  • Put simply, it is a string that works like an ID card number, except that it identifies resources (such as an email address, a host name, or a file).
  • A URI follows specific rules:
      • The general form is [scheme]:[scheme-specific-part][#fragment].
      • Subdivided further, it is [scheme]:[//authority][/path][?query][#fragment]; the scheme-specific part consists of the authority, path, and query, and the authority can be thought of as a domain name such as www.baidu.com.
      • Subdivided all the way, it is [scheme]:[//host:port][/path][?query][#fragment], which looks exactly like a URL.
  • The form of the scheme-specific part depends on the scheme. Common URI schemes are:
      • ftp: an FTP server
      • file: a file on the local disk
      • http: the Hypertext Transfer Protocol
      • mailto: an email address
      • telnet: a connection to a Telnet-based service
    Java also makes wide use of some non-standard custom schemes, such as rmi, jar, jndi, doc, and jdbc.
  • In Java, a URI is modeled by the java.net.URI class. Its commonly used constructors and factory method are:
// Build a URI by parsing str
public URI(String str) throws URISyntaxException
// Build a URI from its components
public URI(String scheme, String authority,
       String path, String query, String fragment) throws URISyntaxException
// Calls URI(String str) and rethrows a syntax error as IllegalArgumentException
public static URI create(String str)
  • Commonly used accessor methods of java.net.URI
public String getScheme()             // the scheme
public String getSchemeSpecificPart() // the scheme-specific part
public String getFragment()           // the fragment identifier
// The three methods above apply to every URI
public String getAuthority() // the authority, e.g. www.baidu.com
public String getHost()      // the host, e.g. 127.0.0.1
public int getPort()         // the port, e.g. 8080
public String getPath()      // the path
public String getQuery()     // the query
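As a quick illustration of these accessors, the sketch below parses one hierarchical URI and one opaque (mailto) URI and prints their components. The addresses use the reserved example.com domain and are assumptions made for the demo, as are the class and method names.

```java
import java.net.URI;

// Hypothetical demo class; the URIs are made-up examples.
class UriParts {

    // A hierarchical URI: scheme, authority, path, query, and fragment are all present.
    static URI hierarchical() {
        return URI.create("https://www.example.com:8080/docs/index.html?lang=en#top");
    }

    // An opaque URI: only the scheme and the scheme-specific part are defined.
    static URI opaque() {
        return URI.create("mailto:user@example.com");
    }

    static void print(URI uri) {
        System.out.println(uri);
        System.out.println("  scheme               = " + uri.getScheme());
        System.out.println("  scheme-specific part = " + uri.getSchemeSpecificPart());
        System.out.println("  authority            = " + uri.getAuthority());
        System.out.println("  host                 = " + uri.getHost());
        System.out.println("  port                 = " + uri.getPort());
        System.out.println("  path                 = " + uri.getPath());
        System.out.println("  query                = " + uri.getQuery());
        System.out.println("  fragment             = " + uri.getFragment());
    }

    public static void main(String[] args) {
        print(hierarchical());
        print(opaque());
    }
}
```

For the mailto URI, the hierarchical accessors return null (or -1 for the port), which matches the note that only the first three methods apply to every URI.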

6 The URL concept and how it differs from URI

  • URL stands for Uniform Resource Locator.
  • URLs are a subset of URIs: besides identifying a resource, a URL also gives a way to find it. In the Java class library, the URI class has no methods for accessing the resource, its only job is parsing, whereas the URL class can open a stream to the resource.
  • URNs (Uniform Resource Names) are another subset of URIs: a URN only names the resource and does not say how to locate it. For instance, mailto:clswcl@gmail.com tells you it is a mailbox but not how to find it (strictly speaking, formal URNs use the urn: scheme).
  • Put simply, a URN tells you there is a place called Guangzhou but not how to get there: you could take the train or you could fly. One URL tells you to fly to Guangzhou, and another URL tells you to take the train.
  • General syntax of a URL
protocol://host:port/path?query#fragment
[protocol]:[//host:port][/path][?query][#fragment]
  • URL constructors and ways to obtain a URL
// Build a URL instance from a URL string
public URL(String spec) throws MalformedURLException
// Here file corresponds to the path, query, and fragment parts together
public URL(String protocol, String host, int port, String file) throws MalformedURLException

// Obtain URLs through a class loader (name is the resource name)
URL systemResource = ClassLoader.getSystemResource(name);
Enumeration<URL> systemResources = ClassLoader.getSystemResources(name);
URL resource = Main.class.getResource(name);
Enumeration<URL> resources = Main.class.getClassLoader().getResources(name);
  • Methods on URL for getting at the resource's data
public final InputStream openStream() throws java.io.IOException
public URLConnection openConnection() throws java.io.IOException
public final Object getContent() throws java.io.IOException
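To show openStream() in action without depending on the network, the sketch below writes a temporary file, turns its path into a file: URL, and reads the content back through the URL. The class name, method name, and temp-file prefix are assumptions made for the example; it assumes Java 9 or later for readAllBytes().

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; names are illustrative only.
class UrlOpenStreamDemo {

    // Writes content to a temp file, then reads it back through a file: URL.
    static String roundTrip(String content) {
        try {
            Path file = Files.createTempFile("url-demo", ".txt");
            try {
                Files.write(file, content.getBytes(StandardCharsets.UTF_8));
                URL url = file.toUri().toURL(); // URI only identifies; URL can open a stream
                try (InputStream in = url.openStream()) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("read through a URL"));
    }
}
```

The same openStream() call works for http: and https: URLs; a file: URL is used here only to keep the example self-contained.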

7 Spring's Resource and how Spring obtains resources

  • No discussion of resources is complete without Spring's resource access. There are two common approaches:
      • through implementations of the Resource interface
      • through the ResourceLoader interface
  • Main methods of Spring's Resource interface
// Whether the resource exists
boolean exists();
// Return the URL for this resource; throws an exception if it cannot be resolved as a URL (ByteArrayResource, for example, cannot)
URL getURL() throws IOException;
// Return the URI for this resource
URI getURI() throws IOException;
// Return the File for this resource
File getFile() throws IOException;
// Return the corresponding ReadableByteChannel
default ReadableByteChannel readableChannel() throws IOException
  • Using the main Resource implementations
  • FileSystemResource: obtains a resource through the file system
Resource resource = new FileSystemResource("D:/example.txt");
File file = new File("example.txt");
Resource resource2 = new FileSystemResource(file);
  • ByteArrayResource: a resource backed by a byte array. It is implemented on top of a byte array and ByteArrayInputStream, and its use cases are similar to those of ByteArrayInputStream: caching byte[] data.
  • ClassPathResource: obtains a resource from the classpath
// The three fields of ClassPathResource.java
private final String path;
// The resource is loaded through either the ClassLoader or the Class
private ClassLoader classLoader;
private Class<?> clazz;

--- Usage ----
Resource resource = new ClassPathResource("test.txt");
  • InputStreamResource: takes an InputStream object and exposes it as a resource
  • ServletContextResource: loads resources from the ServletContext environment, with paths relative to the web application root
  • UrlResource: accesses HTTP resources, FTP resources, and so on through a URL

8 Obtaining resources through ResourceLoader

[Figure: resource.png]

  • ResourceLoader hides the concrete Resource implementation and gives uniform access to resources. Through a ResourceLoader you can load a ClassPathResource, a FileSystemResource, and so on.
public interface ResourceLoader {
  // Prefix for loading from the classpath: "classpath:", which yields a ClassPathResource
  String CLASSPATH_URL_PREFIX = ResourceUtils.CLASSPATH_URL_PREFIX;
  Resource getResource(String location);
  // (other members omitted)
}
  • By default, the ResourceLoader interface loads resources from the classpath
public interface ResourcePatternResolver extends ResourceLoader {
  // Prefix for loading matching files from every classpath location (including JARs): "classpath*:", which yields ClassPathResource
  String CLASSPATH_ALL_URL_PREFIX = "classpath*:";
  Resource[] getResources(String locationPattern) throws IOException;
}
  • With the classpath*: prefix, ResourcePatternResolver loads matching files from every classpath location and returns them as ClassPathResource. The classpath: prefix searches only the class path of the classes, whereas classpath*: scans all JARs as well as the class path.
// Ant-style pattern: com/smart/**/*.xml
ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
Resource[] resources = resolver.getResources("com/smart/**/*.xml");

// ApplicationContext ctx
// A resource on the file system
Resource fileTemplate = ctx.getResource("file:///res.txt");
// A UrlResource
Resource urlTemplate = ctx.getResource("https://my.cn/res.txt");
  • The location passed to ResourceLoader.getResource can carry a prefix to obtain something other than a ClassPathResource; the locationPattern passed to ResourcePatternResolver.getResources additionally supports Ant-style patterns.

Prefix     | Example               | Description
classpath: | classpath:config.xml  | Load from the classpath
file:      | file:///res.txt       | Load from the file system
http:      | http://my.cn/res.txt  | Load as a UrlResource

9 Getting to know java.util.Properties

  • Properties is Java's built-in class for handling configuration; it offers two ways to load a resource
public class Properties extends Hashtable<Object,Object> {
    .... // Load the contents of a properties file from a Reader or an InputStream
    public synchronized void load(Reader reader) throws IOException
    public synchronized void load(InputStream inStream) throws IOException
}
  • Sample code for reading configuration with Properties
//res.properties
username = root
password = password
------- Code example -------------
InputStream input = ClassLoader.getSystemResourceAsStream("res.properties");
Properties prop = new Properties();
prop.load(input); // load from the input stream
String username = prop.getProperty("username");
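The two load overloads tie back to the encoding discussion earlier: load(InputStream) decodes the bytes as ISO-8859-1, while load(Reader) uses whatever charset the Reader was built with. The sketch below illustrates the difference with an in-memory properties text; the class name, the title key, and its value are assumptions added for the demo.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class; the "title" key is made up for illustration.
class PropertiesEncodingDemo {
    // Properties text containing the non-Latin value 程序, written with escapes.
    static final String CONTENT = "username = root\ntitle = \u7a0b\u5e8f\n";

    // load(InputStream): the UTF-8 bytes are decoded as ISO-8859-1.
    static Properties loadViaStream() {
        Properties prop = new Properties();
        byte[] utf8 = CONTENT.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(utf8)) {
            prop.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return prop;
    }

    // load(Reader): the Reader decodes the bytes with the charset we choose.
    static Properties loadViaReader() {
        Properties prop = new Properties();
        byte[] utf8 = CONTENT.getBytes(StandardCharsets.UTF_8);
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            prop.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return prop;
    }

    public static void main(String[] args) {
        System.out.println("via stream: " + loadViaStream().getProperty("title"));
        System.out.println("via reader: " + loadViaReader().getProperty("title"));
    }
}
```

ASCII-only keys and values such as username = root come out the same either way; only the non-Latin value is affected.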

10 Reading yml configuration

  • A plain Java project that needs to read yml can add jackson-dataformat-yaml; Spring Boot supports reading yml configuration out of the box
<dependency>
  <groupId>com.fasterxml.jackson.dataformat</groupId>
  <artifactId>jackson-dataformat-yaml</artifactId>
  <version>2.9.5</version>
</dependency>
  • Reading yml configuration on top of jackson-dataformat-yaml
//res.yml configuration
name: chen
params:
  url:  http://www.my.com

---------- Code example ---------------
// Yaml is SnakeYAML's org.yaml.snakeyaml.Yaml; MapUtils is assumed to be the Apache Commons Collections class
InputStream input = ClassLoader.getSystemResourceAsStream("res.yml");
Yaml yaml = new Yaml();
Map<String, Object> map = yaml.loadAs(input, LinkedHashMap.class); // load from the input stream
String name = MapUtils.getString(map, "name"); // chen
Map<String, Object> params = (Map<String, Object>) map.get("params");
String url = MapUtils.getString(params, "url"); // http://www.my.com

11 Closing resources gracefully: try-with-resources and Lombok's @Cleanup

  • Every opened resource needs a matching close, but we often forget to close it, or the closing code ends up scattered messily in several places. Is there a simpler way?
  • A resource class that should close automatically implements the AutoCloseable interface and is used together with the try-with-resources syntactic sugar
public class YSOAPConnection implements AutoCloseable {
    private SOAPConnection connection;
    public static YSOAPConnection open(SOAPConnectionFactory soapConnectionFactory) throws SOAPException  {
        YSOAPConnection ySoapConnection = new YSOAPConnection();
        // Assign the field directly; the class as shown declares no setConnection method
        ySoapConnection.connection = soapConnectionFactory.createConnection();
        return ySoapConnection;
    }
    public SOAPMessage call(SOAPMessage request, Object to) throws SOAPException {
        return connection.call(request, to); 
    }
    @Override
    public void close() throws SOAPException {
        if (connection != null) {  connection.close(); }
    }
}
// Using the auto-closing resource class
try (YSOAPConnection soapConnection = YSOAPConnection.open(soapConnectionFactory)) {
    SOAPMessage soapResponse = soapConnection.call(request, endpoint);
    ... // work with the data
} catch (Exception e) {
    log.error(e.getMessage(), e);
    ...
}
  • Lombok's @Cleanup annotation calls the object's public void close() when the variable goes out of scope; by default the object only needs a no-argument close() method, which every AutoCloseable has
import lombok.Cleanup;
@Cleanup  // using @Cleanup
YSOAPConnection soapConnection = YSOAPConnection.open(soapConnectionFactory);
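When several resources are declared in one try-with-resources statement, they are closed in the reverse order of their declaration, and this happens before any catch block runs. The sketch below records that order in a list, using a hypothetical Named resource class that stands in for real connections or streams.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Named stands in for a real resource.
class CloseOrderDemo {

    // A trivial resource that logs when it is opened and closed.
    static final class Named implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Named(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    // Opens two resources, fails inside the body, and returns the event log.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Named first = new Named("first", log);
             Named second = new Named("second", log)) {
            log.add("body");
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Even though the body throws, both resources are closed, second before first, and only then does the catch block see the exception.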

12 What is the worst that happens if a resource is not closed

  • A native JDK resource class that is left unclosed does not stay open forever: the JVM closes it automatically by way of finalize, as FileInputStream does
// FileInputStream.java - JDK 8
// JDK 8's FileInputStream overrides finalize to make sure the opened resource is closed before the object is reclaimed
protected void finalize () throws IOException {
    if (guard != null) {
        guard.warnIfOpen();
    }
    if ((fd != null) && (fd != FileDescriptor.in)) {
        close();
    }
}
  • From JDK 9 on, the Cleaner mechanism replaces the finalize mechanism. Objects cleaned up through Cleaner should also implement the AutoCloseable interface. Cleaner is built on PhantomReference; readers interested in the implementation details can consult the relevant documentation.
  • However, relying on the JDK's own closing mechanism is far slower than closing the resource yourself. According to one measurement, closing a resource with try-with-resources and having the garbage collector reclaim it takes about 12 nanoseconds, while using the finalizer mechanism raises that to 550 nanoseconds.
  • Resources that are not closed promptly stay occupied and get in the way of other threads. Take Linux file resources: by default a process can open at most 1024 files (2048 on some systems; the value is configurable). If one thread holds a dozen or more file resources and leaves it to the finalizer mechanism to release them, which happens only at some unpredictable later time, the other threads in the same process can end up waiting indefinitely.
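For readers curious what the Cleaner pattern looks like, here is a minimal sketch modeled on the usage shown in the java.lang.ref.Cleaner documentation. The class names are hypothetical, and an AtomicBoolean stands in for a real native handle. It requires Java 9 or later. The explicit close() path is what try-with-resources exercises; the Cleaner only acts as a safety net if close() is never called.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; the flag stands in for a native handle.
class CleanerBackedResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the CleanerBackedResource itself,
    // otherwise the object could never become phantom reachable.
    private static final class ReleaseAction implements Runnable {
        private final AtomicBoolean released;

        ReleaseAction(AtomicBoolean released) {
            this.released = released;
        }

        @Override
        public void run() {
            released.set(true); // a real class would free its handle here
        }
    }

    private final Cleaner.Cleanable cleanable;

    CleanerBackedResource(AtomicBoolean released) {
        this.cleanable = CLEANER.register(this, new ReleaseAction(released));
    }

    @Override
    public void close() {
        cleanable.clean(); // runs the action at most once, on this thread
    }

    // Opens and closes the resource and reports whether the action ran.
    static boolean releasedAfterTryWithResources() {
        AtomicBoolean released = new AtomicBoolean(false);
        try (CleanerBackedResource resource = new CleanerBackedResource(released)) {
            // use the resource here
        }
        return released.get();
    }

    public static void main(String[] args) {
        System.out.println("released: " + releasedAfterTryWithResources());
    }
}
```

When close() is skipped, the same action runs only after the garbage collector notices the object is unreachable, which may be much later or, in a short-lived process, not at all.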


Author: clswcl
Link: https://juejin.im/post/6856266775022174222
Source: Juejin


Copyright notice
This article was created by [osc_ao91jbnq]. Please include a link to the original when reposting. Thank you.
