File size determines how long your page takes to load: the larger the file, the longer the load, and higher image resolution and quality both increase file size. People rarely have the patience to wait for slow pages, so keeping your files small reduces the time it takes to access your website. Large, high-quality images should typically be kept between 60K and 100K in size. Smaller photos should be about 30K or less.
Fortunately, many of the most widely used image file formats on the Internet support compression. When you save a file in one of these formats, the image data is compressed, and web browsers decompress it to display the image on screen. Some graphics programs let you set the compression level when you save, which controls both image quality and file size. Depending on how you want to use the photos on your website, you may need to experiment with this setting to find the balance that keeps quality high while keeping the file size small.
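As a rough illustration of "playing around" with the compression setting, the sketch below searches for the highest JPEG quality that still fits a size budget. It assumes the Pillow library (PIL) is installed. The helper names `highest_quality_under` and `jpeg_encoder` are hypothetical, and the search assumes file size does not shrink as quality rises, which is typical for JPEG but not guaranteed.

```python
import io
from typing import Callable, Optional


def highest_quality_under(encode: Callable[[int], bytes],
                          max_bytes: int,
                          lo: int = 10,
                          hi: int = 95) -> Optional[int]:
    """Binary-search the highest quality whose encoded size fits max_bytes.

    Returns None if even the lowest quality is too large.
    Assumption: encoded size does not decrease as quality increases.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if len(encode(mid)) <= max_bytes:
            best = mid      # fits the budget: try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1    # too large: try a lower quality
    return best


def jpeg_encoder(path: str) -> Callable[[int], bytes]:
    """Return a function that encodes the image at `path` as JPEG bytes."""
    # Pillow is imported here so the search helper works without it.
    from PIL import Image

    img = Image.open(path).convert("RGB")

    def encode(quality: int) -> bytes:
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=quality, optimize=True)
        return buf.getvalue()

    return encode
```

For a large photo with a 100K budget, a call might look like `highest_quality_under(jpeg_encoder("photo.jpg"), 100 * 1024)`, where `photo.jpg` is a placeholder file name.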

This means you are not required to do so: you are free to use images of various sizes during training. What you’ve written so far is only the network configuration; there should also be a complete network definition. The network resolution is set by the height and width. See this example of how the aspect ratio is preserved.
Resizing images before training is very common. A size of 416×416 is slightly larger than usual; most ImageNet models, for example, resize and square the images to 256×256, and I’d expect the same here. Training on 6000×4000 images would require a GPU farm. The typical method is to pad the image to a square whose side is the larger dimension (height or width), filling the shorter side with zeros, and then resize it with a standard image library such as PIL.
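The pad-then-resize method described above could be sketched as follows. It assumes Pillow (PIL) is installed. The function names are hypothetical. Centring the image on the padded canvas is my assumption, since the text only says the shorter side is padded with zeros, and so is the choice of bilinear resampling.

```python
from typing import Tuple


def square_canvas(width: int, height: int) -> Tuple[int, int, int]:
    """Return (side, x_offset, y_offset) for centring an image on a square.

    The side equals the larger of the two dimensions, so only the
    shorter dimension receives padding.
    """
    side = max(width, height)
    return side, (side - width) // 2, (side - height) // 2


def pad_and_resize(path: str, target: int = 416):
    """Zero-pad the image at `path` to a square, then resize to target x target."""
    # Pillow is imported here so square_canvas can be used without it.
    from PIL import Image

    img = Image.open(path).convert("RGB")
    side, x, y = square_canvas(*img.size)

    # A black (all-zero) canvas supplies the padding.
    canvas = Image.new("RGB", (side, side), (0, 0, 0))
    canvas.paste(img, (x, y))
    return canvas.resize((target, target), Image.BILINEAR)
```

For a 6000×4000 photo, `square_canvas(6000, 4000)` gives a 6000-pixel square with 1000 rows of padding above and below. The aspect ratio of the content is therefore unchanged by the final resize.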
In both training and inference, the darknet API resizes images by default, but any input size with w and h equal to 32 × X should work, where w is the width, h is the height, and X is a natural number. Since X = 13 by default, the default input size is (w, h) = (416, 416). I use this rule with YOLOv3 in OpenCV, and in my experience the larger the X, the better.
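The 32 × X rule is simple arithmetic, and a small sketch makes it concrete. The helper names are hypothetical. The default stride of 32 comes from the rule as stated above, and snapping an arbitrary size to the nearest multiple is an added convenience rather than something darknet does for you.

```python
from typing import Tuple


def network_size(x: int, stride: int = 32) -> Tuple[int, int]:
    """Return (w, h) = (stride * x, stride * x) for a natural number x."""
    if x < 1:
        raise ValueError("x must be a natural number (>= 1)")
    return stride * x, stride * x


def nearest_valid_side(pixels: int, stride: int = 32) -> int:
    """Snap a side length to the nearest multiple of stride (at least one stride).

    Ties are rounded up.
    """
    return max(stride, (pixels + stride // 2) // stride * stride)
```

With OpenCV's dnn module, the resulting pair would typically be passed as the `size` argument of `cv2.dnn.blobFromImage`. Keep in mind that larger inputs cost more memory and compute per image.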

It’s usually not a problem if your photos are a little smaller than recommended; they’ll just have some extra space on the sides. However, if your photos are much smaller than recommended, there will be a lot of extra space, which may look unappealing. Consider swapping templates or switching to a different segment!
It’s also not a problem if the dimensions are larger than recommended, as long as the file size is less than 15MB! We’ll scale images down automatically so they load quickly and your visitors don’t have to download overly large files.

