When refactoring is needed
09.07.2017 at 03:19 #4095
Васильев Владимир СергеевичKeymaster
The topic we chose to discuss in this article is not less important issue than the notion of refactoring. – It is about when refactoring should be done.
In other words, how to determine when our code needs refactoring?
There are lots of features that can help you to determined that the way to improve your code will take a long way, working for hours, and it will require kilometers of rewritten code… Of course, I’m exaggerating, but anything can happen))
So, we will look at the list of these features, and I am sure many will remember them from their own code, as they are used to write it.
If you look more precisely at the definition of some classes in your code , then probably you will notice that you can meet certain parts of the code several times, or at least more than once , in the same class. These can be just small parts of algorithms, and sometimes the entire algorithms. For example, data sorting, searching, arithmetic calculation, etc.
In such cases it is necessary to extract the repeating part of the algorithm and put into a separate method, then you can simply refer to it when needed, and the code will become more concise.
The rule that is often forgotten when you are so much into the process of coding, and the happy news is that the work is going well, and everything seems like it should be: Each method should have a single task.
Method, which includes a couple of hundred lines of code or even more – is a good reason to think, whether it is busy with too many tasks.
Another good sign you could use in this case – comments in the method. When there are several points of the same comments which include comments, it may indicate that the method performs more than one algorithm.
What you should do in this case – devide this method into more methods, each performing only one task.
Another reason to think about refactoring is bulk classes. Then we must first consider why they got so voluminous. Aren’t there too many attributes in a class? And if it is so, are they really all characteristics of this type object ? Or we are using some other class’s attributes by mistake?
Aren’t there too many methods defined in the class? And, if it is so, whether all of them are the behavioral characteristics of this type objects or we should have placed them to some other class?
How to avoid this situation – plan the structure of classes and data types more precisely . Only those variables which are attributes of a class object itself must be class fields. Only those methods which are behavioral characteristics of the class objects must be places within the class.
This often happens due to the lack of planning of application architecture. The code looks solid, its structure is not designed for any changes, extensions, just for one-time use. All classes are rigidly connected to each other, you can not find any use of abstraction. Every attempt to make any changes to any part of the code and opps – all the code is literally falling apart. So now it’s up to you – whether you want to use it or you want to rewrite it from scratch.
The solution is actually simple – work more properly on code desig in terms of its flexibility and extensibility- do not forget about the OOP principles (abstraction and polymorphism in particular).
Sometimes it happens that some functions in a class require for their algorithms data which is not present in the current class and was not even planned in parameters of the method. So the question follows – how we can get this data for the method.
As a result – we have to add extra fields to the class, which was not initially planned, or we have to expand the list of the function parameters, and then maybe search for the way to pass them there, because at the call point they are missing too.
Then there comes an assumption that most likely, it’s not the data that you need to take out of nowhere, but the function itself does not belong to this class. So it’s not about missing data – it’s about the function placed in a wrong class.
The solution is to remove the function and place it to the appropriate class.
Just in case code
When the work on a program begins not with planning its design and thinking on the architecture and the algorithm, but with wrighting code (sounds familiar, huh?), it is difficult to know for sure what data types will be defined and what fields and functions which will be needed. Which means the following work on the code will be rather chaotic, when some classes are defined generally empty, because we can’t decide what exactly will shall use them for. Class fields are also created with no certain purpose, it is doubtful whether they will be usefull in future (but it is better to write it just in case).
And this chaos leads to the fact that there is a bunch of idle code, which is just hanging in the application and only consuming memory and confusing developers who are trying to realise for which purpose it was intended.
The solution – do not write a class, or a method, or a variable, unless you know exactly for what they are needed in the code.
Sometimes you may notice that some of the data fields or method parameters are often found in the code in groups. As if they were inseparable, and meaningless without one another.
This is a signal to extracting the data to a separate class or a structure, and then they will be obviously associated, not only semantically but also syntactically, and we will reduce the list of method parameter.
Now we continue to discuss the reasons for code refactoring. In the first part of the article we started to list the most popular situations when it makes sense to reconsider some aspects of the code design. Here are some more ….
Perhaps this title will confuse the reader, because presence of comments in code is considered to be a sign of a good code style. But let’s think about why we can find comments in somebody’s code:
- if the code is well-designed
- if the author is not sure if he won’t be confused looking into his own code the next day
- if the author is trying to distinguish between large sections of code within a single method
- if the author has performed refactoring and forgot to remore the unnecessary comments left after that
Now let’s think, which of the above points to the need for refactoring. That’s right, all except for the first. And why so? Because they all indicate the presence of flaws in code.
This flaw of a code is not only typical for beginners, but also for professional developers. It lies in the fact that base types (particularly classes) provide too much unnecessary information for successors. Variables and methods of the base class, which are not used and are unlikely to be involved in the child classes . It is a violation of one of SOLID principles – interface segregation principle.
As usual, when you try to make something better, it turns into a problem.
Solution – creation of “thin” interfaces to inherit.
This happens when you directly specify some file addresses (for example, images, audio files, video, databases) that are not built in resources of the application.
For example, we have a class that stored an address of some image in its field , and it looks like this:
string path = @ “D: \ Images \ MyPhoto.jpg”;
or like this:
string path1 = @ “C: \ Users \ admin \ Documents \ Visual Studio 2010 \ Projects \ CoolApp \ Images \ MyPhoto.jpg”;
In the first case, this is an address of a file that is not even stored in the application folder, in the second – it is stored in a subdirectory of the application. What’s wrong with this approach?
The fact is that when a file is located outside of the application, it can be accidentally or intentionally deleted or moved, or the folder can be renamed, and then the address specified in the variable path, becomes irrelevant. In the second case, the relevance is lost if we, for example, move the project folder to a different directory.
Thus, when using this approach we are bound to a specific computer , and even in that case it only works under condition that the addresses of files and folders are inviolable.
Solution – use the address strings in unified format.
Multipurpose and useless
This is the case if in the pursuit of flexibility and extensibility we get a class, with all sorts of interfaces, delegates, but performing nothing obvious, in fact. That happens when our efforts have the opposite effect and the so-called “universality” dispels the class purpose. When you look at this class, the question arises: what is the class doing? The authors of this code will respond in an injured voice : it can do everything !!!! It knows how to work with all types of data through the parameters of the interface type, it is able to perform any calculations, which will be set through the appropriate delegate !!!!
In other words – nothing specific…
So why write such code and who will use its benefit? It is better to create classes that are suitable for specific cases and are operating a clear set of actions that you want the task. I am not saying that a certain flexibility of the code is a bad idea, but inventing super – classes, suitable for all occasions – is a totally useless work.
Solution: classes and other types (interfaces, structures, delegates, enumerations), should be created in most cases for a specific purpose (which does not always mean – for a specific application).
This is a consequence of design patterns possession, and I have already posted an article on this topic on my blog , it was is called “Design patterns overuse , or a smelling code“.
To tell in brief, it is improper use of existing architectural solutions, a complete ignoring of their essence. In this case, the main indicator of the “quality” is the amount of applied design patterns, rather than the appropriateness of their use.
Solution – a closer look at the essence of used design patterns and compare it with a specific case in the code where you are using them.
The main disadvantage of the switch statement is its inability to be extended: once you specify how many cases is – you can never add more. And if we assume that each case is using some data type , it turns into a typical “hardcoding”. And I want to ask, what would happen if there were new options added to the application or any of the current were to be updated? We would have to rewrite the method, which as we know is not allowed.
A good example, when the switch operator or its analog in some situations – conditional statement (if) – should definitely be replaced by applying polymorphism, you can find in my article dedicated to the Open-closed principle (SOLID).
Sometimes in a class you can find definition of a function that performs just one operation, or simply calls another function (whose algorithm is not used anywhere else in the class). And nothing more.
It’s hard to say why it is done like that. Rather i would think – it’s a literally perceived rule that each function must performs olny one task. The word “task” might mean an algorithm or a calculation that is performed frequently, or at least more than once, so it is easier to allocate it in a separate function and call where you want.
When a method of the class is written to call the second method of the same class, and in it the third method of the same class – it is strange.
Neither you should possessively split algorithms into smaller parts, defining each in a separate function. For example, when calculating the arithmetic average, there is no need to place calculation of a sum in one function and division by the number of elements to another.
At this point I will stop , although there are much more reasons for code refactoring than listed above, and obviously the reason is that there are no limits to idiocy))) just kidding, of course.
You must be logged in to reply to this topic.